
Abstract
Answering complex natural language questions often necessitates multi-step reasoning and integrating external information. Several systems have combined knowledge retrieval with a large language model (LLM) to answer such questions. These systems, however, suffer from various failure cases, and we cannot directly train them end-to-end to fix such failures, as interaction with external knowledge is non-differentiable. To address these deficiencies, we define a ReAct-style LLM agent with the ability to reason and act upon external knowledge. We further refine the agent through a ReST-like method that iteratively trains on previous trajectories, employing growing-batch reinforcement learning with AI feedback for continuous self-improvement and self-distillation. Starting from a prompted large model and after just two iterations of the algorithm, we can produce a fine-tuned small model that achieves comparable performance on challenging compositional question-answering benchmarks with two orders of magnitude fewer parameters.
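The abstract describes two interlocking loops: a ReAct-style inner loop in which the agent interleaves reasoning with calls to external knowledge, and a ReST-like outer loop that repeatedly samples trajectories from the current policy, filters them with AI feedback, and fine-tunes on the survivors. The sketch below illustrates that outer loop under stated assumptions only; `Trajectory`, `agent_generate`, `ai_feedback_score`, and `fine_tune` are hypothetical placeholders, not the paper's actual interfaces.

```python
# Hypothetical sketch of the ReST-like self-improvement loop described in the
# abstract. All names here (Trajectory, agent_generate, ai_feedback_score,
# fine_tune) are illustrative stand-ins, not the paper's API.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    question: str
    steps: List[str]   # interleaved "Thought:" / "Action:" / "Observation:" records
    answer: str


def rest_style_iteration(
    agent_generate: Callable[[str], Trajectory],       # current policy: question -> trajectory
    ai_feedback_score: Callable[[Trajectory], float],  # LLM-based ranker (AI feedback)
    fine_tune: Callable[[List[Trajectory]], None],     # trains a model on kept trajectories
    questions: List[str],
    threshold: float = 0.5,
) -> None:
    """One grow-then-improve step: grow a batch of trajectories with the
    fixed current policy, keep only those the AI critic rates highly,
    then fine-tune on the kept trajectories."""
    # Grow: roll out the agent on each question. Interaction with external
    # knowledge (e.g. search) happens inside agent_generate and is
    # non-differentiable, so no gradients flow through it.
    batch = [agent_generate(q) for q in questions]

    # Filter: reinforcement via AI feedback rather than human labels.
    kept = [t for t in batch if ai_feedback_score(t) >= threshold]

    # Improve: supervised fine-tuning on self-generated, filtered data.
    fine_tune(kept)
```

Because the environment interaction is non-differentiable, all learning happens through this grow/filter/fine-tune cycle; pointing `fine_tune` at a smaller student model rather than the original policy corresponds to the self-distillation setting the abstract mentions.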
Benchmarks
| Benchmark | Method | Accuracy (%) |
|---|---|---|
| Question Answering on Bamboogle | ReST meets ReAct (PaLM 2-L + Google Search) | 76.1 |