SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

Xinyi He Qian Liu Mingzhe Du Lin Yan Zhijie Fan Yiming Huang Zejian Yuan Zejun Ma


Abstract

Code performance optimization is paramount in real-world software engineering and critical for production-level systems. While Large Language Models (LLMs) have demonstrated impressive capabilities in code generation and bug fixing, their proficiency in enhancing code performance at the repository level remains largely unexplored. To address this gap, we introduce SWE-Perf, the first benchmark specifically designed to systematically evaluate LLMs on code performance optimization tasks within authentic repository contexts. SWE-Perf comprises 140 carefully curated instances, each derived from performance-improving pull requests from popular GitHub repositories. Each benchmark instance includes the relevant codebase, target functions, performance-related tests, expert-authored patches, and executable environments. Through a comprehensive evaluation of representative methods that span file-level and repo-level approaches (e.g., Agentless and OpenHands), we reveal a substantial capability gap between existing LLMs and expert-level optimization performance, highlighting critical research opportunities in this emerging field.
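
To make the setting concrete, the sketch below shows one minimal way a single instance of this kind could be scored: run the instance's performance-related tests in the provided environment, apply a candidate (or expert) patch, re-run the same tests, and compare runtimes. This is an illustrative assumption, not SWE-Perf's actual harness or metric; the repository path, patch file, test command, and run count are hypothetical placeholders.

```python
import statistics
import subprocess
import time

# Hypothetical placeholders for illustration only; the real benchmark
# defines its own instance layout, test selection, and measurement protocol.
REPO_DIR = "/path/to/instance/repo"
PATCH_FILE = "/path/to/instance/candidate_patch.diff"
TEST_CMD = ["python", "-m", "pytest", "tests/test_perf_case.py", "-q"]
RUNS = 5


def time_tests() -> float:
    """Run the performance-related tests once and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(TEST_CMD, cwd=REPO_DIR, check=True, capture_output=True)
    return time.perf_counter() - start


def median_runtime() -> float:
    """Take the median of several timed runs to reduce measurement noise."""
    return statistics.median(time_tests() for _ in range(RUNS))


baseline = median_runtime()

# Apply the patch under evaluation, then re-measure the same tests.
subprocess.run(["git", "apply", PATCH_FILE], cwd=REPO_DIR, check=True)
patched = median_runtime()

print(f"baseline: {baseline:.3f}s  patched: {patched:.3f}s  "
      f"speedup: {baseline / patched:.2f}x")
```

A practical harness would also need to verify that the tests still pass after patching (correctness) and control for machine noise, which is why repeated runs and an isolated executable environment matter.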
