HyperAI

Papers

Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

Trajectory-Refined Distillation

Reinforcement Learning

Li Jiang, Haoran Xu, Yichuan Ding, et al.

MemDreamer: Decoupling Perception and Reasoning for Long Video Understanding via Hierarchical Graph Memory and Agentic Retrieval Mechanism

Video Understanding

Cong Chen, Guo Gan, Kaixiang Ji, et al.

SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

Pu Ning, Quan Chen, Kun Tao, et al.

Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts

Wenbo Pan, Shujie Liu, Chin-Yew Lin, et al.

Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution

Xucong Wang, Ziyu Ma, Shidong Yang, et al.

ABot-Earth 0.5: Generative 3D Earth Model

Ming Qian, Tianjian Ouyang, Mingchao Sun, et al.

Kwai Keye-VL-2.0 Technical Report

Video Understanding

Kwai Keye Team, Bin Wen, Changyi Liu, et al.

TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis

Multimodal Representation

Zhengpeng Feng, Clement Atzberger, Sadiq Jaffer, et al.

If LLMs have human-like attributes, then so does Age of Empires II

Adrian de Wynter

The Last Human-Written Paper: Agent-Native Research Artifacts

Jiachen Liu, Jiaxin Pei, Jintao Huang, et al.

FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention

Yan Wang, Qifan Zhang, Jiachen Yu, et al.

LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents

Aofan Yu, Chenyu Zhou, Tianyi Xu, et al.

CoVEBench: Can Video Editing Models Handle Complex Instructions?

Video Generation

Jiangtao Wu, Jiaming Wang, Yiwen He, et al.

Latent Spatial Memory for Video World Models

Video Generation

Diffusion Model

Weijie Wang, Haoyu Zhao, Yifan Yang, et al.

On the Geometry of On-Policy Distillation

Zhennan Shen, Yanshu Li, Qingyu Yin, et al.

SWE-Explore: Benchmarking How Coding Agents Explore Repositories

Code Generation

Shaoqiu Zhang, Yuhang Wang, Jialiang Liang, et al.

VoxCPM2 Technical Report

LongCat-Video-Avatar 1.5 Technical Report

Diffusion Model

Video Generation

Meituan LongCat Team

ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding

Visual Question Answering

Jovana Kondic, Pengyuan Li, Dhiraj Joshi, et al.

ACL-Verbatim: hallucination-free question answering for research

Retrieval-Augmented Generation

Intelligent Question Answering

Gábor Recski, Szilveszter Tóth, Nadia Verdha, et al.

Beyond Static Dialogues: Benchmarking Realistic, Heterogeneous, and Evolving Long-Term Memory

Han Zhang, Zihao Tang, Xin Yu, et al.

The End of Software Engineering: How AI Agents Are Fundamentally Restructuring the Software Paradigm

Why Larger Models Learn More: Effects of Capacity, Interference, and Rare-Task Retention

Multi-Task Learning

Jing Huang, Daniel Wurgaft, Rachit Bansal, et al.

When Tools Fail: Benchmarking Dynamic Replanning and Anomaly Recovery in LLM Agents

Dongsheng Zhu, Xuchen Ma, Yucheng Shen, et al.

Direct 3D-Aware Object Insertion via Decomposed Visual Proxies

Diffusion Model

Image Generation

Jingbo Gong, Yikai Wang, Yushi Lan, et al.

AnchorWorld: Embodied Egocentric World Simulation with View-based Evolution Customization

Embodied Intelligence

Yu Li, Menghan Xia, Gongye Liu, et al.

SoCRATES: Towards Reliable Automated Evaluation of Proactive LLM Mediation across Domains and Socio-cognitive Variations

Taewon Yun, Hyeonseong Park, Jeonghwan Choi, et al.

MMAE: A Massive Multitask Audio Editing Benchmark

Audio and Speech Processing

Ziyang Ma, Ruiqi Yan, Ruiyang Xu, et al.

Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings

Songhao Wu, Zhongxin Chen, Yuxuan Liu, et al.

ChordEdit: One-Step Low-Energy Transport for Image Editing

Diffusion Model

Liangsi Lu, Xuhang Chen, Minzhe Guo, et al.

NitroGen: An Open Foundation Model for Generalist Gaming Agents

Loïc Magne, Anas Awadalla, Guanzhi Wang, et al.

Efficiently Reconstructing Dynamic Scenes One D4RT at a Time

Depth Estimation

3D Machine Vision

Chuhan Zhang, Guillaume Le Moing, Skanda Koppula, et al.
