AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Abstract

In this paper, we introduce a novel learning paradigm for adaptive Large Language Model (LLM) agents that eliminates the need for fine-tuning the underlying LLMs. Existing approaches are often either rigid, relying on static, handcrafted reflection workflows, or computationally intensive, requiring gradient updates of LLM model parameters. In contrast, our method enables low-cost continual adaptation via memory-based online reinforcement learning. We formalise this as a Memory-augmented Markov Decision Process (M-MDP), equipped with a neural case-selection policy to guide action decisions. Past experiences are stored in an episodic memory, either differentiable or non-parametric. The policy is continually updated based on environmental feedback through a memory rewriting mechanism, whereas policy improvement is achieved through efficient memory reading (retrieval). We instantiate our agent model in the deep research setting, namely AgentFly, which attains top-1 on GAIA validation (87.88% Pass@3) and 79.40% on the test set. It reaches 66.6% F1 and 80.4% PM on the DeepResearcher dataset, outperforming the state-of-the-art training-based method, while case-based memory adds 4.7% to 9.6% absolute points on out-of-distribution tasks. Our approach offers a scalable and efficient pathway for developing generalist LLM agents capable of continuous, real-time learning without gradient updates, advancing machine learning towards open-ended skill acquisition and deep research scenarios. The code is available at https://github.com/Agent-on-the-Fly/AgentFly.
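
To make the write/read loop described in the abstract concrete, here is a minimal sketch of a non-parametric episodic case memory. It is not the authors' implementation: the names `Case`, `CaseMemory`, `write`, `read`, and `cosine` are hypothetical, the state vectors stand in for whatever embedding a real agent would use, and plain cosine similarity replaces the neural case-selection policy mentioned above.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Case:
    state: List[float]  # assumed: an embedding of the task or state
    action: str         # the plan or action that was taken
    reward: float       # environmental feedback received for that action


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class CaseMemory:
    cases: List[Case] = field(default_factory=list)

    def write(self, state: List[float], action: str, reward: float) -> None:
        """Memory rewriting: store a new experience once feedback arrives."""
        self.cases.append(Case(list(state), action, reward))

    def read(self, query: List[float], k: int = 2) -> List[Case]:
        """Memory reading: return the k stored cases most similar to the query."""
        ranked = sorted(self.cases, key=lambda c: cosine(query, c.state), reverse=True)
        return ranked[:k]


# Toy usage with made-up two-dimensional "embeddings".
memory = CaseMemory()
memory.write([1.0, 0.0], "search the web", 1.0)
memory.write([0.0, 1.0], "run code", 0.0)
memory.write([0.9, 0.1], "search then browse", 1.0)

top = memory.read([1.0, 0.0], k=2)
top_actions = [case.action for case in top]
```

In this sketch the LLM itself is never updated; adaptation comes only from which cases are stored and which are retrieved to condition the next decision.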
