HyperAI

Main

GPU

Console
Studio
Docs
Pricing

Pulse

News

Resources

Papers
Notebooks
Datasets
Wiki

Benchmarks

SOTA
LLM Models
GPU Leaderboard

Community

Events

Utility

About Terms of Service Privacy Policy
English

Papers

Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

Build the Future of Artificial Intelligence

About

About Us Support Dataset Help

Products

News Papers Notebooks Datasets Wiki

Links

© HyperAI

GitHub Discord X (formerly Twitter)


A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models

Kirill Borodin, Nikita Vasiliev, Vasiliy Kudryavtsev, et al.

The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs

Diffusion Model

Supervised Fine-Tuning

Zichen Wen, Jiashu Qu, Dongrui Liu, et al.

PrefPalette: Personalized Preference Modeling with Latent Attributes

Preference Modeling

Natural Language Processing

Shuyue Stella Li, Melanie Sclar, Hunter Lang, et al.

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning

Xiaoya Li, Xiaofei Sun, Albert Wang, et al.

AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning

Yiming Ren, Zhiqiang Lin, Yu Li, et al.

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Diffusion Model

Yudong Jin, Sida Peng, Xuan Wang, et al.

The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner

Zhouqi Hua, Wenwei Zhang, Chengqi Lyu, et al.

π^3: Scalable Permutation-Equivariant Visual Geometry Learning

Depth Estimation

3D Machine Vision

Yifan Wang, Jianjun Zhou, Haoyi Zhu, et al.

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning

Visual Question Answering

Senqiao Yang, Junyi Li, Xin Lai, et al.

A Survey of Context Engineering for Large Language Models

Retrieval-Augmented Generation

Lingrui Mei, Jiayu Yao, Yuyao Ge, et al.

Assessing adaptive world models in machines with novel games

Lance Ying, Katherine M. Collins, Prafull Sharma, et al.

Emotional Support with LLM-based Empathetic Dialogue Generation

Supervised Fine-Tuning

Shiquan Wang, Ruiyu Fang, Zhongjiang He, et al.

DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering

Yinsheng Li, Zhen Dong, Yi Shao

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

Xinyi He, Qian Liu, Mingzhe Du, et al.

MOSPA: Human Motion Generation Driven by Spatial Audio

Shuyang Xu, Zhiyang Dou, Mingyi Shi, et al.

MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding

Visual Question Answering

Renjie Li, Ruijie Ye, Mingyang Wu, et al.

PhysX: Physical-Grounded 3D Asset Generation

Ziang Cao, Zhaoxi Chen, Liang Pan, et al.

Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs

Retrieval-Augmented Generation

Yangning Li, Weizhi Zhang, Yuyao Yang, et al.

La-Proteina: Atomistic Protein Generation via Partially Latent Flow Matching

Tomas Geffner, Kieran Didi, Zhonglin Cao, et al.

SUICA: Learning Super-high Dimensional Sparse Implicit Neural Representations for Spatial Transcriptomics

Qingtian Zhu, Yumin Zheng, Yuling Sang, et al.

XiChen: An observation-scalable fully AI-driven global weather forecasting system with 4D variational knowledge

Wuxin Wang, Weicheng Ni, Lilan Huang, et al.

AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs

Florian Grötschla, Luis Müller, Jan Tönshoff, et al.

Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers

Visual Question Answering

Document Understanding

Yilun Zhao, Chengye Wang, Chuhan Li, et al.

Scaling Laws for Optimal Data Mixtures

Mustafa Shukor, Louis Bethune, Dan Busbridge, et al.

Subject-Consistent and Pose-Diverse Text-to-Image Generation

Diffusion Model

Zhanxin Gao, Beier Zhu, Liang Yao, et al.

Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models

Image Captioning

Diffusion Model

Tiezheng Zhang, Yitong Li, Yu-cheng Chou, et al.

DuetGraph: Coarse-to-Fine Knowledge Graph Reasoning with Dual-Pathway Global-Local Fusion

Jin Li, Zezhong Ding, Xike Xie

CogDDN: A Cognitive Demand-Driven Navigation with Decision Optimization and Dual-Process Thinking

Yuehao Huang, Liang Liu, Shuangming Lei, et al.

LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers

Jingze Zhu, Yongliang Wu, Wenbo Zhu, et al.

Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation

Sangmin Bae, Yujin Kim, Reza Bayat, et al.

REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once

Zhuoshi Pan, Qizhi Pei, Yu Li, et al.

EmbRACE-3K: Embodied Reasoning and Action in Complex Environments

Embodied Intelligence

Mingxian Lin, Wei Huang, Yitang Li, et al.

