UQ Unsolved Questions Dataset

Date: 2 months ago
Organization: Stanford University, University of Washington
Paper URL: 2508.17580
License: CC BY-SA 4.0


The UQ dataset is an evaluation benchmark released in 2025 by Stanford University, the University of Washington, the University of North Carolina, and other institutions. The accompanying paper, "UQ: Assessing Language Models on Unsolved Questions", aims to evaluate the reasoning, factuality, and browsing capabilities of frontier large language models on real, difficult questions that humans have not yet answered.

The dataset consists of 500 long-standing unanswered questions from the Stack Exchange platform, covering topics such as theoretical computer science, mathematics, science fiction, and history. Questions are collected through a three-stage pipeline: rule-based filtering, then LLM review, then manual review. The dataset comes with UQ-Validators for automatic pre-screening of candidate answers, complemented by community review. Its defining characteristics are questions that are difficult yet realistic, asynchronous evaluation, and a separation between answer generation and answer verification. It is suited to evaluating the reasoning and retrieval abilities of frontier models, tracking progress over the long term, and maintaining public leaderboards.
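The three-stage collection pipeline described above can be pictured as a chain of filters, each narrowing the candidate pool. The following is only a minimal sketch: the `Question` fields, the one-year threshold, and the bodies of `rule_filter`, `llm_review`, and `manual_review` are hypothetical stand-ins for illustration, not the criteria actually used to build UQ.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Question:
    title: str
    age_days: int      # hypothetical field: how long the question has been open
    answer_count: int  # hypothetical field: number of answers posted so far


# A stage takes a question and decides whether it stays in the pool.
Stage = Callable[[Question], bool]


def rule_filter(q: Question) -> bool:
    # Hypothetical rule: keep questions open for at least a year with no answers.
    return q.answer_count == 0 and q.age_days >= 365


def llm_review(q: Question) -> bool:
    # Stand-in for a model-based judgment; here just a trivial length check.
    return len(q.title.split()) >= 3


def manual_review(q: Question) -> bool:
    # Stand-in for a human decision; this sketch accepts everything.
    return True


def run_pipeline(questions: Iterable[Question], stages: List[Stage]) -> List[Question]:
    # Apply each stage in order, keeping only the questions that pass.
    kept = list(questions)
    for stage in stages:
        kept = [q for q in kept if stage(q)]
    return kept


candidates = [
    Question("Is this graph property decidable?", 900, 0),
    Question("Short one", 1200, 0),
    Question("Why does this lemma hold here?", 30, 0),
    Question("Already answered question about history", 2000, 2),
]
selected = run_pipeline(candidates, [rule_filter, llm_review, manual_review])
print([q.title for q in selected])
```

Ordering the stages from cheapest to most expensive means the costly steps (model calls, human time) only see candidates that survived the earlier checks.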

Data distribution:

  • Science: 395
  • Technology: 52
  • Culture & Recreation: 16
  • Life & Arts: 35
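
As a quick check on the distribution above, the category counts can be converted into percentage shares. This is a small illustrative calculation using only the four counts listed; note that they sum to 498, two fewer than the 500 questions stated, and the page does not explain the difference.

```python
# Category counts as listed on the page.
counts = {
    "Science": 395,
    "Technology": 52,
    "Culture & Recreation": 16,
    "Life & Arts": 35,
}

total = sum(counts.values())

# Share of each category, as a percentage of the listed total.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, n in counts.items():
    print(f"{name}: {n} ({shares[name]}%)")
print("Total:", total)
```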

[Figure: Dataset construction process]
