WebExplorer-QA Information Retrieval Question Answering Dataset
License: Apache 2.0
WebExplorer-QA is a dataset for information retrieval and web browsing tasks, released in 2025 by the Hong Kong University of Science and Technology, MiniMax, and the University of Waterloo. It accompanies the paper "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents", which aims to improve model performance on complex multi-step reasoning and long-horizon web navigation by systematically generating challenging query-answer pairs.
Currently, only 100 high-quality examples from this dataset are publicly available for academic research and community testing. The data are produced in two stages: a model first explores the web to generate initial question-answer pairs, and these pairs are then iteratively refined through a "long-to-short" query evolution mechanism to increase question difficulty and to tighten the link between information retrieval and answer accuracy. Answering each question requires multi-step retrieval and browsing, with information aggregated from multiple web pages. The pairs are suitable for training and evaluating web agents or large language models on information seeking, multi-step reasoning, long-horizon context processing, tool calling, and web navigation.
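As a rough illustration of how such question-answer pairs might be used for evaluation, the sketch below scores an agent by normalized exact match. Everything in it is an assumption made for illustration: the field names `question` and `answer`, the `toy_agent` function, and the two toy records are invented placeholders and are not taken from the dataset, and exact match is a simplified stand-in for whatever scoring the paper actually uses.

```python
import re
import string
from typing import Callable, Dict, Iterable


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def exact_match_accuracy(
    records: Iterable[Dict[str, str]],
    agent: Callable[[str], str],
) -> float:
    """Fraction of records whose normalized prediction equals the reference.

    Assumes each record has hypothetical "question" and "answer" keys.
    """
    records = list(records)
    if not records:
        return 0.0
    hits = sum(
        normalize(agent(r["question"])) == normalize(r["answer"])
        for r in records
    )
    return hits / len(records)


# Invented placeholder records, NOT real dataset entries.
toy_records = [
    {"question": "Which city hosts the example museum?", "answer": "Paris"},
    {"question": "In which year did the example bridge open?", "answer": "1932"},
]


def toy_agent(question: str) -> str:
    """Hypothetical stand-in for a web agent that would search and browse."""
    return "paris." if "museum" in question else "unknown"


score = exact_match_accuracy(toy_records, toy_agent)
print(f"exact-match accuracy: {score:.2f}")
```

In practice the `agent` callable would wrap a model with search and browsing tools, and a more lenient scorer may be preferable because free-form answers rarely match a reference string exactly.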