6 months ago

Hang Yan Fangzhi Xu Rongman Xu Yifei Li Jian Zhang Haoran Luo Xiaobao Wu Luu Anh Tuan Haiteng Zhao Qika Lin

Abstract

Large Language Models (LLMs) have achieved impressive performance on reasoning-intensive tasks, yet optimizing their reasoning efficiency remains an open challenge. While Test-Time Scaling (TTS) improves reasoning quality, it often leads to overthinking, wasting tokens on redundant computation. This work investigates how to guide LLM test-time scaling efficiently and adaptively without additional training. Inspired by the concept of momentum in physics, we propose Momentum Uncertainty-guided Reasoning (MUR), which dynamically allocates thinking budget to critical reasoning steps by tracking and aggregating stepwise uncertainty over time. To support flexible inference-time control, we introduce gamma-control, a simple mechanism that tunes the reasoning budget via a single hyperparameter. We provide a theoretical proof supporting the superiority of MUR in terms of stability and bias. MUR is evaluated against various TTS methods on four challenging benchmarks (MATH-500, AIME24, AIME25, and GPQA-diamond) using three sizes of recent Qwen3 models (1.7B, 4B, and 8B). Results show that MUR reduces computation by over 50% on average while improving accuracy by 0.62-3.37%.
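The idea of tracking stepwise uncertainty with momentum and gating extra compute through a single hyperparameter can be illustrated with a minimal sketch. The abstract does not state the update rule or the trigger condition, so everything below is an assumption made for illustration: step uncertainty is taken as the mean negative log-probability of a step's tokens, the momentum is an exponential moving average with a hypothetical smoothing factor `alpha`, and a step is flagged for extra thinking when its uncertainty exceeds the running momentum divided by `gamma`. The function names are hypothetical and are not taken from the paper.

```python
from typing import List, Optional


def step_uncertainty(token_logprobs: List[float]) -> float:
    """Mean negative log-probability of the tokens in one reasoning step.

    Assumption: this is one common uncertainty proxy, not necessarily
    the measure used by the authors.
    """
    return -sum(token_logprobs) / len(token_logprobs)


def select_steps_to_scale(
    uncertainties: List[float],
    alpha: float = 0.9,   # hypothetical smoothing factor for the momentum
    gamma: float = 0.9,   # hypothetical budget knob; smaller flags fewer steps
) -> List[int]:
    """Return indices of steps whose uncertainty spikes above the momentum.

    The momentum is an exponential moving average of past step
    uncertainties. A step is flagged when its uncertainty exceeds
    momentum / gamma, where the momentum covers earlier steps only.
    """
    flagged: List[int] = []
    momentum: Optional[float] = None
    for i, u in enumerate(uncertainties):
        # Compare against the momentum of previous steps (skip the first step).
        if momentum is not None and u > momentum / gamma:
            flagged.append(i)
        # Fold the current step into the running momentum.
        momentum = u if momentum is None else alpha * momentum + (1 - alpha) * u
    return flagged


if __name__ == "__main__":
    # Step 2 is much less certain than its history, so only it is flagged.
    print(select_steps_to_scale([0.2, 0.2, 0.9, 0.2], alpha=0.5, gamma=0.9))
```

In this sketch only the flagged steps would be handed to a test-time scaling method, while the remaining steps are kept as generated. Under the assumed rule, lowering `gamma` raises the threshold and therefore spends less extra compute.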

MUR: Momentum Uncertainty guided Reasoning for Large Language Models
