HyperAI

Papers

Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends



Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition

Diffusion Model

Jiahang Cao, Yize Huang, Hanzhong Guo, et al.

Large Reasoning Models Learn Better Alignment from Flawed Thinking

Preference Modeling

ShengYun Peng, Eric Smith, Ivan Evtimov, et al.

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Zichen Wen, Shaobo Wang, Yufa Zhou, et al.

Apriel-1.5-15b-Thinker

Visual Question Answering

Shruthan Radhakrishna, Aman Tiwari, Aanjaneya Shukla, et al.

StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

Yanxu Chen, Zijun Yao, Yantao Liu, et al.

Interactive Training: Feedback-Driven Neural Network Optimization

Human-Computer Interaction

Wentao Zhang, Yang Young Lu, Yuntian Deng

StealthAttack: Robust 3D Gaussian Splatting Poisoning via Density-Guided Illusions

3D Machine Vision

Bo-Hsu Ke, You-Zhe Xie, Yu-Lun Liu, et al.

ExGRPO: Learning to Reason from Experience

Reinforcement Learning

Runzhe Zhan, Yafu Li, Zhi Wang, et al.

Self-Forcing++: Towards Minute-Scale High-Quality Video Generation

Diffusion Model

Video Generation

Justin Cui, Jie Wu, Ming Li, et al.

LongCodeZip: Compress Long Context for Code Language Models

Code Generation

Yuling Shi, Yichun Qian, Hongyu Zhang, et al.

PIPer: On-Device Environment Setup via Online Reinforcement Learning

Reinforcement Learning

Supervised Fine-Tuning

Alexander Kovrigin, Aleksandra Eliseeva, Konstantin Grotov, et al.

Rethinking Reward Models for Multi-Domain Test-Time Scaling

Supervised Fine-Tuning

Dong Bok Lee, Seanie Lee, Sangwoo Park, et al.

Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation

Reinforcement Learning

Ziniu Li, Congliang Chen, Tianyun Yang, et al.

GEM: A Gym for Agentic LLMs

Reinforcement Learning

Zichen Liu, Anya Sims, Keyu Duan, et al.

VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators

Reinforcement Learning

Embodied Intelligence

Hengtao Li, Pengxiang Ding, Runze Suo, et al.

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Reinforcement Learning

Fang Wu, Weihao Xuan, Heli Qi, et al.

OceanGym: A Benchmark Environment for Underwater Embodied Agents

Embodied Intelligence

Yida Xue, Mingjun Mao, Xiangyuan Ru, et al.

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

Reinforcement Learning

Supervised Fine-Tuning

Zhepei Wei, Xiao Yang, Kai Sun, et al.

Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning

Supervised Fine-Tuning

Shaobo Wang, Jiaming Wang, Jiajun Zhang, et al.

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Natural Language Processing

Adrian Kosowski, Przemysław Uznański, Jan Chorowski, et al.

Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

Visual Question Answering

Qinsi Wang, Bo Liu, Tianyi Zhou, et al.

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Zijian Wu, Xiangyan Liu, Xinyuan Zhang, et al.

Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards

Reinforcement Learning

Haoran He, Yuxiao Ye, Qingpeng Cai, et al.

Democratizing AI scientists using ToolUniverse

Shanghua Gao, Richard Zhu, Pengwei Sui, et al.

When Does Reasoning Matter? A Controlled Study of Reasoning's Contribution to Model Performance

Supervised Fine-Tuning

Nicolas Boizard, Hippolyte Gisserot-Boukhlef, Kevin El-Haddad, et al.

Multiplayer Nash Preference Optimization

Preference Modeling

Reinforcement Learning

Fang Wu, Xu Huang, Weihao Xuan, et al.

StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs

Audio and Speech Processing

Yuhan Song, Linhao Zhang, Chuhan Wu, et al.

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

Diffusion Model

Jintao Zhang, Haoxu Wang, Kai Jiang, et al.

SimpleFold: Folding Proteins is Simpler than You Think

Yuyang Wang, Jiarui Lu, Navdeep Jaitly, et al.

POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion

Document Understanding

Yuan Liu, Zhongyin Zhao, Le Tian, et al.

Generalizable Geometric Image Caption Synthesis

Image Captioning

Yue Xin, Wenyuan Wang, Rui Pan, et al.

Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective

Reinforcement Learning

Supervised Fine-Tuning

Siwei Wang, Yifei Shen, Haoran Sun, et al.

