Abstract
Scaling test-time compute has shown remarkable success in improving the reasoning abilities of large language models (LLMs). In this work, we conduct the first systematic exploration of applying test-time scaling methods to language agents and investigate the extent to which doing so improves their effectiveness. Specifically, we explore different test-time scaling strategies, including: (1) parallel sampling algorithms; (2) sequential revision strategies; (3) verifiers and merging methods; and (4) strategies for diversifying rollouts. We carefully analyze and ablate the impact of different design strategies when applying test-time scaling to language agents, and arrive at the following findings: (1) scaling test-time compute can improve agent performance; (2) knowing when to reflect is important for agents; (3) among the verification and result-merging approaches studied, the list-wise method performs best; and (4) increasing the diversity of rollouts has a positive effect on the agent's task performance.
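To make strategies (1) and (3) concrete, the sketch below shows a best-of-N loop: sample several agent rollouts in parallel with varied seeds, then keep the rollout a list-wise verifier ranks highest. This is a minimal illustration, not the paper's implementation; `run_agent` and `list_wise_rank` are hypothetical stand-ins for an agent executor and a verifier LLM that ranks all candidates jointly.

```python
import random
from typing import List

Rollout = str

def run_agent(task: str, temperature: float, seed: int) -> Rollout:
    """Hypothetical stand-in: execute one agent rollout on `task`.
    A real implementation would drive an LLM agent loop here."""
    rng = random.Random(seed)
    return f"trajectory(task={task!r}, T={temperature}, noise={rng.random():.3f})"

def list_wise_rank(task: str, candidates: List[Rollout]) -> List[int]:
    """Hypothetical stand-in for a list-wise verifier: rank all candidates
    jointly in one pass (rather than scoring each one independently),
    returning candidate indices from best to worst."""
    return sorted(range(len(candidates)), key=lambda i: len(candidates[i]))

def best_of_n(task: str, n: int = 8, temperature: float = 1.0) -> Rollout:
    # (1) Parallel sampling: draw n independent rollouts with varied seeds
    #     so the trajectories diverge (the diversification in strategy (4)
    #     could additionally vary temperature or prompts per sample).
    candidates = [run_agent(task, temperature, seed=i) for i in range(n)]
    # (3) List-wise merging: the verifier sees all candidates at once and
    #     ranks them; we keep the top-ranked rollout.
    ranking = list_wise_rank(task, candidates)
    return candidates[ranking[0]]

if __name__ == "__main__":
    print(best_of_n("book a flight from SFO to JFK"))
```

The list-wise step is the design choice the abstract highlights: because the verifier compares all candidates in a single context, it can trade them off against each other, which pointwise or pairwise scoring cannot do directly.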