HyperAI

Main

GPU

Console
Studio
Docs
Pricing

Pulse

News

Resources

Papers
Notebooks
Datasets
Wiki

Benchmarks

SOTA
LLM Models
GPU Leaderboard

Community

Events

Utility

About Terms of Service Privacy Policy
English

Papers

Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

Build the Future of Artificial Intelligence

About

About Us Support Dataset Help

Products

News Papers Notebooks Datasets Wiki

Links

© HyperAI

GitHub Discord X (formerly Twitter)

Continual Learning Bench: Evaluating Frontier AI Systems in Real-World Stateful Environments

Parth Asawa, Christopher M. Glaze, Gabriel Orlanski, et al.

MEMORY CACHING: RNNs with Growing Memory

Ali Behrouz, Zeman Li, Yuan Deng, et al.

RobotValues: Evaluating Household Robots When Human Values Conflict

Jongwook Han, Hyeongjin Kim, Yohan Jo

VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding

Video Understanding

Visual Question Answering

Lin Fu, Zheyuan Yang, Yang Wang, et al.

AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints

Jiayu Liu, Cheng Qian, Zhenhailong Wang, et al.

TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration

Soyeong Jeong, Jinheon Baek, Minki Kang, et al.

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

Woojung Song, Nalim Kim, Sangjun Song, et al.

Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution

Code Generation

Liliana Hotsko, Yinxi Li, Yuntian Deng, et al.

Self-Distilled Policy Gradient

Reinforcement Learning

Yifeng Liu, Shiyouan Zhang, Yifan Zhang, et al.

GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

Iman Mirzadeh, Keivan Alizadeh, Oncel Tuzel, et al.

MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation

Huawei Lin, Peng Li, Jie Song, et al.

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Akter, Xiao, Liu, et al.

Qwen-Image-Flash: Beyond Objective Design

Image Generation

Tianhe Wu, Kun Yan, Zikai Zhou, et al.

OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs

Video Understanding

Yifei Li, Pengyiang Liu, Yuhang Zang, et al.

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Reinforcement Learning

Xuekang Wang, Zhuoyuan Hao, Shuo Hou, et al.

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

Jiaming Wang, Ziteng Feng, Jiangtao Wu, et al.

Audio Interaction Model

Audio and Speech Processing

Zhifei Xie, Zihang Liu, Ze An, et al.

Cosmos 3: Omnimodal World Models for Physical AI

Aditi, Niket Agarwal, Arslan Ali, et al.

Learning, Fast and Slow: Towards LLMs That Adapt Continually

Supervised Fine-Tuning

Rishabh Tiwari, Kusha Sareen, Lakshya A Agrawal, et al.

LEAP: Supercharging LLMs for Formal Mathematics with Agentic Frameworks

Text Generation

Po-Nien Kung, Linfeng Song, Dawsen Hwang, et al.

World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning

Visual Question Answering

Yucheng Zhou, Wei Tao, Yiwen Guo, et al.

From Activation to Causality: Discovery of Causal Visual Representations in the Human Brain

Image Generation

Multimodal Representation

Yuval Golbari, Navve Wasserman, Matias Cosarinsky, et al.

A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL

Reinforcement Learning

Lei Yang, Siyu Ding, Deyi Xiong

Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

Object Tracking

Zekun Qi, Xuchuan Chen, Dairu Liu, et al.

Trust Region On-Policy Distillation

Text Generation

Xingrun Xing, Haoqing Wang, Boyan Gao, et al.

OCC-RAG: Optimal Cognitive Core for Faithful Question Answering

Retrieval-Augmented Generation

Intelligent Question Answering

Maksim Savkin, Mikhail Goncharov, Alexander Gambashidze, et al.

MAI-Thinking-1: Building a Hill-Climbing Machine

$VLM^3$: Vision Language Models Are Native 3D Learners

Depth Estimation

Zhipeng Cai, Zhuang Liu, Yunyang Xiong, et al.

Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses

Retrieval-Augmented Generation

Pengcheng Jiang, Zhiyi Shi, Kelly Hong, et al.

DeepCrack: A deep hierarchical feature learning architecture for crack segmentation

Semantic Segmentation

Image Segmentation

Yahui Liu, Lian Yao, Xiaohu Lu, et al.

VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion

Video Generation

Diffusion Model

Hidir Yesiltepe, Jiazhen Hu, Tuna Han Salih Meral, et al.

Draft-OPD: On-Policy Distillation for Speculative Draft Models

Text Generation

Haodi Lei, Yafy Li, Haoran Zhang, et al.
