HyperAI

Main

GPU

Console
Studio
Docs
Pricing

Pulse

News

Resources

Papers
Notebooks
Datasets
Wiki

Benchmarks

SOTA
LLM Models
GPU Leaderboard

Community

Events

Utility

About Terms of Service Privacy Policy
English

Papers

Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

Build the Future of Artificial Intelligence

About

About Us Support Dataset Help

Products

News Papers Notebooks Datasets Wiki

Links

© HyperAI

GitHub Discord X (formerly Twitter)

3DCodeBench: Benchmarking Agentic Procedural 3D Modeling Via Code

Code Generation

Yipeng Gao, Lei Shu, Genzhi Ye, et al.

RadImageNet-VQA: A Large-Scale CT and MRI Dataset for Radiologic Visual Question Answering

Medical Imaging

Visual Question Answering

Leo Butsanets, Charles Corbiere, Julien Khlaut, et al.

Training Software Engineering Agents and Verifiers with SWE-Gym

Supervised Fine-Tuning

Jiayi Pan, Xingyao Wang, Graham Neubig, et al.

MAKIEVAL: A Multilingual Automatic WiKIdata-based Framework for Cultural Awareness Evaluation for LLMs

Text Generation

Raoyuan Zhao, Beiduo Chen, Barbara Plank, et al.

GeneralVLA-2: Geometry-Aware Reconstruction and Governed Memory for Robot Planning

3D Machine Vision

Retrieval-Augmented Generation

Haoyu Wang, Guoqing Ma, Zeyu Zhang, et al.

Multi-Turn Reflective Masking Elicits Reasoning in Mask Diffusion Models

Diffusion Model

Text Generation

Yanming Zhang, Yihan Bian, Jingyuan Qi, et al.

BrainG3N: A Dual-Purpose Tokenizer for Controllable 3D Brain MRI Generation

Diffusion Model

Max Van Puyvelde, Ibrahim Gulluk, Wim Van Criekinge, et al.

GateMem: Benchmarking Memory Governance in Multi-Principal Shared-Memory Agents

Zhe Ren, Yibo Yang, Yimeng Chen, et al.

MemSlides: A Hierarchical Memory Driven Agent Framework for Personalized Slide Generation with Multi-turn Local Revision

Ye Jin, Yangyang Xu, Jun Zhu, et al.

PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models

Diffusion Model

Image Captioning

Yueyi Sun, Yuhao Wang, Jason Li, et al.

Code World Models for General Game Playing

Code Generation

Wolfgang Lehrach, Daniel Hennes, Miguel Lázaro-Gredilla, et al.

Beyond Static Leaderboards: Predictive Validity for the Evaluation of LLM Agents

Dhaval C. Patel, Kaoutar El Maghraoui, Shuxin Lin, et al.

S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence

Video Understanding

Yalun Dai, Hao Li, Shulin Tian, et al.

Multi-LCB: Extending LiveCodeBench to Multiple Programming Languages

Code Generation

Maria Ivanova, Pavel Zadorozhny, Rodion Levichev, et al.

Playful Agentic Robot Learning

Code Generation

Junyi Zhang, Jiaxin Ge, Hanjun Yoo, et al.

DragMesh-2: Physically Plausible Dexterous Hand-Object Interaction with Articulated Objects

Tianshan Zhang, Yijia Duan, Yanjun Li, et al.

Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performance

Image Inpainting

Diffusion Model

Kangsheng Duan, Ziyang Xu, Wenyu Liu, et al.

EfficientRollout: System-Aware Self-Speculative Decoding for RL Rollouts

Reinforcement Learning

Minseo Kim, Minjae Lee, Seunghyuk Oh, et al.

Trust the Right Teacher: Quality-Aware Self-Distillation for GUI Grounding

Jingyuan Huang, Zuming Huang, Yucheng Shi, et al.

Reinforcing Dual-Path Reasoning in Spatial Vision Language Models

3D Machine Vision

Yatai Ji, An-Chieh Cheng, Yang Fu, et al.

SAE Interventions are Unreliable: Post-Intervention Recovery of Suppressed Behavior

Mingyue Cui, Linghui Shen, Xingyi Yang

Kairos: A Native World Model Stack for Physical AI

Kairos Team, Fei Wang, Shan You, et al.

Guava: An Effective and Universal Harness for Embodied Manipulation

Embodied Intelligence

Haowen Liu, Xirui Li, Shaoxiong Yao, et al.

Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games

Shengyuan Ding, Xilin Wei, Xinyu Fang, et al.

LifeSciBench: Evaluating Language Models on Realistic, Expert-Level Tasks in the Life Sciences

Amelia Liu, Andrew Ho, Anne Marie Droste, et al.

TRIAGE: Dialectical Reasoning for Explainable Risk Prediction on Irregularly Sampled Medical Time Series with LLMs

Hyeongwon Jang, Gyouk Chu, Changhun Kim, et al.

LectūraAgents: A Multi-Agent Framework for Adaptive Personalized AI-Assisted Learning and Embodied Teaching

Embodied Intelligence

Jaward Sesay, Yue Yu, Siwei Dong, et al.

GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine?

Code Generation

Tongxu Luo, Rongsheng Wang, Jiaxi Bi, et al.

Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients

Reinforcement Learning

Byung-Kwan Lee, Ximing Lu, Shizhe Diao, et al.

ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining

Supervised Fine-Tuning

Hao Li, Ganlong Zhao, Yufei Liu, et al.

LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling

Code Generation

Jian Yang, Shawn Guo, Wei Zhang, et al.

Predicting LLM Safety Before Release by Simulating Deployment

Text Generation

Marcus Williams, Hannah Sheahan, Cameron Raymond, et al.
