2 months ago

Wanghan Xu, Yuhao Zhou, Yifan Zhou, Qinglong Cao, Shuo Li, Jia Bu, Bo Liu, Yixin Chen, Xuming He, Xiangyu Zhao

Abstract

Despite advances in scientific AI, a coherent framework for Scientific General Intelligence (SGI), the ability to autonomously conceive, investigate, and reason across scientific domains, remains lacking. We present an operational SGI definition grounded in the Practical Inquiry Model (PIM: Deliberation, Conception, Action, Perception) and operationalize it via four scientist-aligned tasks: deep research, idea generation, dry/wet experiments, and experimental reasoning. SGI-Bench comprises over 1,000 expert-curated, cross-disciplinary samples inspired by Science's 125 Big Questions, enabling systematic evaluation of state-of-the-art LLMs. Results reveal gaps: low exact match (10-20%) in deep research despite step-level alignment; ideas lacking feasibility and detail; high code executability but low execution-result accuracy in dry experiments; low sequence fidelity in wet protocols; and persistent multimodal comparative-reasoning challenges. We further introduce Test-Time Reinforcement Learning (TTRL), which optimizes retrieval-augmented novelty rewards at inference, enhancing hypothesis novelty without reference answers. Together, our PIM-grounded definition, workflow-centric benchmark, and empirical insights establish a foundation for AI systems that genuinely participate in scientific discovery.
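The abstract does not say how the retrieval-augmented novelty reward is computed. As a rough illustration only, the sketch below assumes one plausible reading: a candidate hypothesis is scored by how dissimilar it is to the most similar retrieved document, so no reference answer is needed. The function names (`bag_of_words`, `cosine`, `novelty_reward`), the bag-of-words similarity, and the sample texts are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count alphanumeric tokens (assumed tokenization)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def novelty_reward(hypothesis: str, retrieved: list[str]) -> float:
    """Assumed reward: 1 minus the highest similarity to any retrieved document.

    With nothing retrieved, the hypothesis is treated as maximally novel.
    """
    if not retrieved:
        return 1.0
    h = bag_of_words(hypothesis)
    return 1.0 - max(cosine(h, bag_of_words(doc)) for doc in retrieved)


if __name__ == "__main__":
    # Hypothetical retrieved literature and candidate hypotheses.
    retrieved_docs = [
        "graph neural networks predict protein binding affinity",
        "transformers model protein sequences",
    ]
    candidates = [
        "graph neural networks predict protein binding affinity",
        "diffusion models design enzymes for plastic degradation",
    ]
    for text in candidates:
        print(f"{novelty_reward(text, retrieved_docs):.3f}  {text}")
```

In a test-time setting, a score of this kind could serve as the reward signal when sampling several hypotheses per prompt. A real system would likely replace the token-overlap similarity with learned embeddings and an actual retrieval step.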

Benchmarks

LLM

Retrieval-Augmented Generation

AI Infra

Method/Architecture

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

Wanghan Xu, Yuhao Zhou, Yifan Zhou, Qinglong Cao, Shuo Li, Jia Bu, Bo Liu, Yixin Chen, Xuming He, Xiangyu Zhao, and 97 more
