4 months ago

AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench

Edan Toledo, Karen Hambardzumyan, Martin Josifoski, Rishi Hazra, Nicolas Baldwin, Alexis Audran-Reiss, Michael Kuchnik, Despoina Magka, et al.

Abstract

AI research agents are demonstrating great potential to accelerate scientific progress by automating the design, implementation, and training of machine learning models. We focus on methods for improving agents' performance on MLE-bench, a challenging benchmark where agents compete in Kaggle competitions to solve real-world machine learning problems. We formalize AI research agents as search policies that navigate a space of candidate solutions, iteratively modifying them using operators. By designing and systematically varying different operator sets and search policies (Greedy, MCTS, Evolutionary), we show that their interplay is critical for achieving high performance. Our best pairing of search strategy and operator set achieves a state-of-the-art result on MLE-bench lite, increasing the success rate of achieving a Kaggle medal from 39.6% to 47.7%. Our investigation underscores the importance of jointly considering the search strategy, operator design, and evaluation methodology in advancing automated machine learning.
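
To make the "search policy plus operators" framing concrete, the following is a minimal sketch of a greedy policy, not the authors' implementation. Everything in it is an assumption made for illustration: candidate solutions are plain floats rather than ML pipelines, the operators `refine` and `explore` are hypothetical stand-ins, and `evaluate` is a toy score instead of a validation metric on a Kaggle task.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """One candidate solution in the search graph (toy: a single float)."""
    solution: float
    score: float
    parent: Optional["Node"] = None


# An operator maps an existing solution to a modified one.
Operator = Callable[[float, random.Random], float]


def refine(x: float, rng: random.Random) -> float:
    """Hypothetical operator: small local change to a solution."""
    return x + rng.gauss(0.0, 0.5)


def explore(x: float, rng: random.Random) -> float:
    """Hypothetical operator: larger, riskier change to a solution."""
    return x + rng.gauss(0.0, 5.0)


def evaluate(x: float) -> float:
    """Toy fitness, higher is better, with its maximum at x = 3."""
    return -((x - 3.0) ** 2)


def greedy_search(
    operators: List[Operator],
    score_fn: Callable[[float], float],
    initial: float,
    steps: int,
    seed: int = 0,
) -> Tuple[Node, List[Node]]:
    """Greedy policy: always expand the best-scoring node found so far."""
    rng = random.Random(seed)
    nodes = [Node(initial, score_fn(initial))]
    for _ in range(steps):
        best = max(nodes, key=lambda n: n.score)   # policy: choose which node to expand
        op = rng.choice(operators)                 # choose which operator to apply
        child = op(best.solution, rng)             # operator produces a new candidate
        nodes.append(Node(child, score_fn(child), parent=best))
    return max(nodes, key=lambda n: n.score), nodes


if __name__ == "__main__":
    best, nodes = greedy_search([refine, explore], evaluate, initial=-8.0, steps=50)
    print(f"explored {len(nodes)} candidates; best solution {best.solution:.3f}")
```

In this sketch the policy (which node to expand) and the operator set (how a node is modified) are separate arguments, so one can be varied while the other is held fixed. Replacing the `max` selection with a tree-search or population-based rule would give a different policy over the same operators.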

Code Repositories

facebookresearch/aira-dojo (official)
