HyperAI

Papers

Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency

Wang, Chenlong, Feng, et al.

Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning

Video Understanding

Shulin Tian, Ruiqi Wang, Hongming Guo, et al.

DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

Mingxuan Du, Benfeng Xu, Chiwei Zhu, et al.

Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning

Zhou, Yuhao, Wang, et al.

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

MiniMax, Aili Chen, Aonian Li, et al.

Polystyrene nanoplastics disrupt the intestinal microenvironment by altering bacteria-host interactions through extracellular vesicle-delivered microRNAs

Molecular Network

Wei-Hsuan Hsu, You-Zuo Chen, Yi-Ting Chiang, et al.

Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache

Xiaoran Liu, Siyang He, Qiqi Wang, et al.

A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation

Yukang Feng, Jianwen Sun, Chuanhao Li, et al.

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Reinforcement Learning

Liang, Xiao, Li, et al.

LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?

Code Generation

Zihan Zheng, Zerui Cheng, Zeyu Shen, et al.

The Diffusion Duality

Diffusion Model

Natural Language Processing

Sahoo, Subham Sekhar, Deschenaux, et al.

Effective Red-Teaming of Policy-Adherent Agents

Itay Nakash, George Kour, Koren Lazar, et al.

Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation

Image Inpainting

Min-Seop Kwak, Junho Kim, Sangdoo Yun, et al.

Unified differentiable learning of electric response

Neural Networks

Stefano Falletta, Andrea Cepellotti, Anders Johansson, et al.

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos

Video Understanding

Yu, Jiashuo, Wu, et al.

AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation

Video Generation

Shi, Haoyuan, Li, et al.

Text-Aware Image Restoration with Diffusion Models

Diffusion Model

Image Inpainting

Jaewon Min, Jin Hyeon Kim, Paul Hyunbin Cho, et al.

Magistral

Reinforcement Learning

Mistral-AI, Abhinav Rastogi, Albert Q. Jiang, et al.

SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks

Code Generation

Lianghong Guo, Yanlin Wang, Caihua Li, et al.

ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning

Yu Sun, Xingyu Qian, Weiwen Xu, et al.

Sapiens: Foundation for Human Vision Models

Computer Vision

Multi-Task Learning

Rawal Khirodkar, Timur Bagautdinov, Julieta Martinez, et al.

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Fuzhao Xue, Yukang Chen, Dacheng Li, et al.

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

Reinforcement Learning

Huajian Xin, Z. Z. Ren, Junxiao Song, et al.

LLaVA-OneVision: Easy Visual Task Transfer

Video Understanding

Bo Li, Yuanhan Zhang, Dong Guo, et al.

SAM 2: Segment Anything in Images and Videos

Computer Vision

Video Understanding

Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, et al.

The Llama 3 Herd of Models

Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, et al.

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Multimodal Representation

Pan Zhang, Xiaoyi Dong, Yuhang Zang, et al.

MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs

Ziyu Liu, Tao Chu, Yuhang Zang, et al.

What matters when building vision-language models?

Hugo Laurençon, Léo Tronchon, Matthieu Cord, et al.

DDOS: The Drone Depth and Obstacle Segmentation Dataset

Depth Estimation

Semantic Segmentation

Benedikt Kolbeinsson, Krystian Mikolajczyk

Deep learning-based framework for the on-demand inverse design of metamaterials with arbitrary target band gap

Convolutional Neural Network

Than V. Tran, S. S. Nanthakumar, Xiaoying Zhuang

PRefLexOR: preference-based recursive language modeling for exploratory optimization of reasoning and agentic thinking

Preference Modeling

Reinforcement Learning

Markus J. Buehler

