2 months ago

WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents

Abstract

The paradigm of Large Language Models (LLMs) has increasingly shifted toward agentic applications, where web browsing capabilities are fundamental for retrieving information from diverse online sources. However, existing open-source web agents either demonstrate limited information-seeking abilities on complex tasks or lack transparent implementations. In this work, we identify that the key challenge lies in the scarcity of challenging data for information seeking. To address this limitation, we introduce WebExplorer: a systematic data generation approach using model-based exploration and iterative, long-to-short query evolution. This method creates challenging query-answer pairs that require multi-step reasoning and complex web navigation. By leveraging our curated high-quality dataset, we successfully develop the advanced web agent WebExplorer-8B through supervised fine-tuning followed by reinforcement learning. Our model supports 128K context length and up to 100 tool-calling turns, enabling long-horizon problem solving. Across diverse information-seeking benchmarks, WebExplorer-8B achieves state-of-the-art performance at its scale. Notably, as an 8B-sized model, WebExplorer-8B is able to effectively search over an average of 16 turns after RL training, achieving higher accuracy than WebSailor-72B on BrowseComp-en/zh and attaining the best performance among models up to 100B parameters on WebWalkerQA and FRAMES. Beyond these information-seeking tasks, our model also achieves strong generalization on the HLE benchmark even though it is only trained on knowledge-intensive QA data. These results highlight our approach as a practical path toward long-horizon web agents.
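
To make the "up to 100 tool calling turns" budget concrete, here is a minimal sketch of a turn-bounded tool-calling loop. It is not the WebExplorer implementation: the abstract does not describe the agent's interface, so `run_episode`, `Episode`, `toy_policy`, `toy_tools`, and the `search` tool name are all hypothetical placeholders. Only the cap of 100 turns comes from the abstract.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

MAX_TURNS = 100  # turn budget quoted in the abstract; everything else is assumed

# Hypothetical action format: ("call", tool_name, argument) or ("answer", text, "").
Action = Tuple[str, str, str]


@dataclass
class Episode:
    question: str
    history: List[str] = field(default_factory=list)
    answer: Optional[str] = None
    turns_used: int = 0


def run_episode(
    question: str,
    policy: Callable[[List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    max_turns: int = MAX_TURNS,
) -> Episode:
    """Alternate policy decisions and tool calls until an answer or the budget."""
    episode = Episode(question=question, history=[question])
    for _ in range(max_turns):
        kind, name, arg = policy(episode.history)
        if kind == "answer":
            episode.answer = name
            break
        # Anything else counts as one tool-calling turn.
        episode.turns_used += 1
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool: {name}"
        episode.history.append(f"{name}({arg}) -> {observation}")
    return episode


def toy_policy(history: List[str]) -> Action:
    """Stand-in for the model: search twice, then answer."""
    if len(history) < 3:
        return ("call", "search", f"query {len(history)}")
    return ("answer", "done", "")


# Stand-in tool; a real agent would call a search or browsing backend here.
toy_tools = {"search": lambda q: f"results for {q!r}"}

episode = run_episode("example question", toy_policy, toy_tools)
print(episode.turns_used, episode.answer)
```

If the policy never produces an answer, the loop stops once the budget is spent and `answer` stays `None`, which is one simple way to treat episodes that exhaust their turns.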
