HyperAI

Main

GPU

Console
Studio
Docs
Pricing

Pulse

News

Resources

Papers
Notebooks
Datasets
Wiki

Benchmarks

SOTA
LLM Models
GPU Leaderboard

Community

Events

Utility

About Terms of Service Privacy Policy
English

Papers

Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

Build the Future of Artificial Intelligence

About

About Us Support Dataset Help

Products

News Papers Notebooks Datasets Wiki

Links

© HyperAI

GitHub Discord X (formerly Twitter)

AgentSocialBench: Evaluating Privacy Risks in Human-Centered Agentic Social Networks

Prince Zizhuang Wang, Shuli Jiang

InCoder-32B-Thinking: Industrial Code World Model for Thinking

Jian Yang, Wei Zhang, Jiajun Wu, et al.

Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?

Qianshan Wei, Yishan Yang, Siyi Wang, et al.

Token Warping Helps MLLMs Look from Nearby Viewpoints

Multimodal Representation

Phillip Y. Lee, Chanho Park, Mingue Park, et al.

Self-Distilled RLVR

Reinforcement Learning

Chenxu Yang, Chuanyu Qin, Qingyi Si, et al.

A Simple Baseline for Streaming Video Understanding

Video Understanding

Visual Question Answering

Yujiao Shen, Shulin Tian, Jingkang Yang, et al.

CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery

Ao Qu, Han Zheng, Zijian Zhou, et al.

Steerable Visual Representations

Multimodal Representation

Jona Ruthardt, Manu Gaur, Deva Ramanan, et al.

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Reinforcement Learning

Zhengxi Lu, Zhiyuan Yao, Jinyang Wu, et al.

Generative World Renderer

Diffusion Model

Video Generation

Zheng-Hui Huang, Zhixiang Wang, Jiaming Tan, et al.

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Xinlei Yu, Zhangquan Chen, Yongbo He, et al.

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Hao Liang, Zhengyang Zhao, Meiyi Qiang, et al.

QuitoBench: A High-Quality Open Time Series Forecasting Benchmark

Siqiao Xue, Zhaoyang Zhu, Wei Zhang, et al.

Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification

Code Generation

Zehai He, Wenyi Hong, Zhen Yang, et al.

ViGoR-Bench: How Far Are Visual Generative Models From Zero-Shot Visual Reasoners?

Haonan Han, Jiancheng Huang, Xiaopeng Sun, et al.

MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome

Fangda Ye, Yuxin Hu, Pengxiang Zhu, et al.

Terminal Agents Suffice for Enterprise Automation

Patrice Bechard, Orlando Marquez Ayala, Emily Chen, et al.

ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Songyang Liu, Chaozhuo Li, Chenxu Wang, et al.

Cheap Bootstrap for Fast Uncertainty Quantification of Stochastic Gradient Descent

Henry Lam, Zitong Wang

Generative AI Enables Structural Brain Network Construction from fMRI via Symmetric Diffusion Learning

Diffusion Model

Medical Imaging

Qiankun Zuo, Bangjun Lei, Wanyu Qiu, et al.

Early Exiting Predictive Coding Neural Networks for Edge AI

Image Classification

Alaa Zniber, Mounir Ghogho, Ouassim Karrakchou, et al.

Quadratic Gradient: A Unified Framework Bridging Gradient Descent and Newton-Type Methods by Synthesizing Hessians and Gradients

The capacity region of classes of product broadcast channels

Yanlin Geng, Amin Gohari, Chandra Nair, et al.

Colon-Bench: An Agentic Workflow for Scalable Dense Lesion Annotation in Full-Procedure Colonoscopy Videos

Medical Imaging

Visual Question Answering

Abdullah Hamdi, Changchun Yang, Xin Gao

ToolACE: Winning the Points of LLM Function Calling

Supervised Fine-Tuning

Weiwen Liu, Xu Huang, Xingshan Zeng, et al.

LightMover: Generative Light Movement with Color and Intensity Controls

Diffusion Model

Gengze Zhou, Tianyu Wang, Soo Ye Kim, et al.

Autonomous overtaking trajectory optimization using reinforcement learning and opponent pose estimation

Autonomous Driving

Reinforcement Learning

Matej Rene Cihlar, Luka Šiktar, Branimir Ćaran, et al.

Make It Up: Fake Images, Real Gains in Generalized Few-shot Semantic Segmentation

Diffusion Model

Semantic Segmentation

Guohuan Xie, Xin He, Dingying Fan, et al.

Two-Stage Acoustic Adaptation with Gated Cross-Attention Adapters for LLM-Based Multi-Talker Speech Recognition

Audio Recognition

Hao Shi, Yuan Gao, Xugang Lu, et al.

A Comparative Study in Surgical AI: Datasets, Foundation Models, and Barriers to Med-AGI

Medical Imaging

Kirill Skobelev, Eric Fithian, Yegor Baranovski, et al.

Text Data Integration

Natural Language Processing

Md Ataur Rahman, Dimitris Sacharidis, Oscar Romero, et al.

Unified Number-Free Text-to-Motion Generation Via Flow Matching

Diffusion Model

Guanhe Huang, Oya Celiktutan
