3 months ago

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Abstract

Recent advances in large language models (LLMs) and multi-agent systems have demonstrated remarkable capabilities in complex problem-solving tasks such as deep research, vibe coding, and mathematical reasoning. However, most existing multi-agent systems are built upon manual prompt/workflow engineering with sophisticated agent frameworks, making them computationally inefficient, less capable, and unable to benefit from data-centric learning. In this work, we introduce Chain-of-Agents (CoA), a novel paradigm of LLM reasoning that enables native end-to-end complex problem-solving in the same way as a multi-agent system (i.e., multi-turn problem solving with multiple tools and multiple agents) within one model. In chain-of-agents problem-solving, the model dynamically activates different tool agents and role-playing agents to simulate multi-agent collaboration in an end-to-end fashion. To elicit end-to-end chain-of-agents problem-solving abilities in LLMs, we introduce a multi-agent distillation framework to distill state-of-the-art multi-agent systems into chain-of-agents trajectories for agentic supervised fine-tuning. We then use agentic reinforcement learning on verifiable agentic tasks to further improve the models' capabilities on chain-of-agents problem solving. We call the resulting models Agent Foundation Models (AFMs). Our empirical studies demonstrate that AFM establishes new state-of-the-art performance across diverse benchmarks in both web agent and code agent settings. We make the entire research, including the model weights, code for training and evaluation, and the training data, fully open-sourced, which offers a solid starting point for future research on agent models and agentic RL.
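
The idea of one model activating tool agents and role-playing agents in turn can be illustrated with a toy loop. The following is only a minimal sketch under stated assumptions, not the authors' implementation: `Step`, `toy_model`, `TOOL_AGENTS`, `run_chain`, and the agent names `plan`, `python`, `search`, `observation`, and `answer` are all hypothetical, and a scripted function stands in for the LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    agent: str    # hypothetical agent name chosen by the model
    content: str  # tool input, role-agent text, observation, or final answer


def search_tool(query: str) -> str:
    # Placeholder for a web-search tool agent (no network access here).
    return f"[search results for: {query}]"


def python_tool(expr: str) -> str:
    # Placeholder for a code-execution tool agent.
    # Toy only: eval with builtins disabled, suitable for simple arithmetic.
    return str(eval(expr, {"__builtins__": {}}, {}))


# Hypothetical registry of tool agents the single model may activate.
TOOL_AGENTS: Dict[str, Callable[[str], str]] = {
    "search": search_tool,
    "python": python_tool,
}


def toy_model(question: str, trajectory: List[Step]) -> Step:
    # Stand-in for the single LLM: a fixed script instead of learned behavior.
    script = [
        Step("plan", "compute the product, then answer"),  # role-playing agent
        Step("python", "6 * 7"),                           # tool agent call
    ]
    emitted = sum(1 for s in trajectory if s.agent != "observation")
    if emitted < len(script):
        return script[emitted]
    last_obs = [s.content for s in trajectory if s.agent == "observation"][-1]
    return Step("answer", last_obs)


def run_chain(question: str,
              model: Callable[[str, List[Step]], Step] = toy_model,
              max_turns: int = 8) -> List[Step]:
    # One model produces every step; tool outputs are fed back as observations.
    trajectory: List[Step] = []
    for _ in range(max_turns):
        step = model(question, trajectory)
        trajectory.append(step)
        if step.agent == "answer":
            break
        if step.agent in TOOL_AGENTS:
            observation = TOOL_AGENTS[step.agent](step.content)
            trajectory.append(Step("observation", observation))
    return trajectory


if __name__ == "__main__":
    for s in run_chain("What is 6 times 7?"):
        print(f"{s.agent}: {s.content}")
```

In this sketch the whole multi-turn trajectory comes from a single policy, which is the property the abstract attributes to CoA; the distillation and reinforcement learning stages are not modeled here.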
