WebSailor: Navigating Super-human Reasoning for Web Agent

Abstract

Transcending human cognitive limitations represents a critical frontier in LLM training. Proprietary agentic systems like DeepResearch have demonstrated superhuman capabilities on extremely complex information-seeking benchmarks such as BrowseComp, a feat previously unattainable. We posit that their success hinges on a sophisticated reasoning pattern absent in open-source models: the ability to systematically reduce extreme uncertainty when navigating vast information landscapes. Based on this insight, we introduce WebSailor, a complete post-training methodology designed to instill this crucial capability. Our approach involves generating novel, high-uncertainty tasks through structured sampling and information obfuscation, RFT cold start, and an efficient agentic RL training algorithm, Duplicating Sampling Policy Optimization (DUPO). With this integrated pipeline, WebSailor significantly outperforms all open-source agents in complex information-seeking tasks, matching proprietary agents' performance and closing the capability gap.

Code Repositories

alibaba-nlp/webwalker (mentioned in GitHub)
SnailDev/github-hot-hub (PyTorch; mentioned in GitHub)
alibaba-nlp/webagent (official; mentioned in GitHub)
kun-g/Scraping-Github-trending (TensorFlow; mentioned in GitHub)
