8 months ago

Zihan Zheng, Zerui Cheng, Zeyu Shen, Shang Zhou, Kaiyuan Liu, Hansen He, Dongruixuan Li, Stanley Wei, Hangyi Hao, Jianzhu Yao, and 9 more

Abstract

Recent reports claim that large language models (LLMs) now outperform elite humans in competitive programming. Drawing on knowledge from a group of medalists in international algorithmic contests, we revisit this claim, examining how LLMs differ from human experts and where limitations still remain. We introduce LiveCodeBench Pro, a benchmark composed of problems from Codeforces, ICPC, and IOI that are continuously updated to reduce the likelihood of data contamination. A team of Olympiad medalists annotates every problem for algorithmic categories and conducts a line-by-line analysis of failed model-generated submissions. Using this new data and benchmark, we find that frontier models still have significant limitations: without external tools, the best model achieves only 53% pass@1 on medium-difficulty problems and 0% on hard problems, domains where expert humans still excel. We also find that LLMs succeed at implementation-heavy problems but struggle with nuanced algorithmic reasoning and complex case analysis, often generating confidently incorrect justifications. High performance appears largely driven by implementation precision and tool augmentation, not superior reasoning. LiveCodeBench Pro thus highlights the significant gap to human grandmaster levels, while offering fine-grained diagnostics to steer future improvements in code-centric LLM reasoning.
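For readers unfamiliar with the pass@1 figure quoted above, the following is a minimal sketch of how a pass@k score is commonly computed, using the widely used unbiased estimator 1 - C(n-c, k) / C(n, k) averaged over problems. The abstract does not describe the benchmark's exact evaluation protocol, so the function names (`pass_at_k`, `mean_pass_at_k`) and the sample data are hypothetical illustrations, not the authors' code or results.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n sampled submissions, c of them accepted.

    Uses the standard unbiased estimator 1 - C(n - c, k) / C(n, k).
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures: every size-k draw contains an accepted submission.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average the per-problem estimate over (n, c) pairs, one pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical (samples, accepted) counts for four problems; not data from the paper.
example_results = [(1, 1), (1, 0), (4, 2), (4, 0)]
print(mean_pass_at_k(example_results, k=1))  # 0.375
```

With k = 1 the estimator reduces to the fraction of accepted samples per problem, so a reported 53% pass@1 can be read as the average single-attempt success rate across the medium-difficulty problems.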

LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
