REFRAG Decoding Framework

Date: 2 months ago

Organization: National University of Singapore

Paper URL: 2509.01092

REFRAG was proposed in September 2025 by Meta Superintelligence Labs in collaboration with the National University of Singapore and Rice University. The work is described in the paper "REFRAG: Rethinking RAG based Decoding".

REFRAG is an efficient decoding framework that reduces latency in Retrieval-Augmented Generation (RAG) applications by compressing, sensing, and expanding the retrieved context. Its central change to the decoding process is that the decoder no longer receives the tokens of the retrieved passages as input. Instead, each passage is split into chunks, and pre-computed, compressed chunk embeddings serve as approximate representations that are fed directly into the decoder. This shortens the decoder's input sequence and reduces reliance on computationally expensive token-level embeddings, so that most of the retrieved context in a RAG setting can be kept in compressed form.
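
The effect of replacing passage tokens with chunk embeddings can be illustrated with a minimal sketch. It is not the paper's implementation: the chunk size `k`, the embedding dimension `dim`, the mean-pooling encoder, the pseudo-random token embeddings, the ordering of query and chunk vectors, and all function names are assumptions chosen only to show how the decoder input shrinks. In REFRAG the chunk embeddings come from a trained encoder and are pre-computed.

```python
import random


def chunk(tokens, k):
    """Split a token list into consecutive chunks of at most k tokens."""
    return [tokens[i:i + k] for i in range(0, len(tokens), k)]


def embed_token(token_id, dim):
    """Hypothetical token embedding: a fixed pseudo-random vector per token id.

    Stands in for a learned embedding table.
    """
    rng = random.Random(token_id)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def compress_chunk(chunk_tokens, dim):
    """Stand-in chunk encoder (assumption): mean-pool the token embeddings
    of one chunk into a single vector."""
    vecs = [embed_token(t, dim) for t in chunk_tokens]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def build_decoder_input(query_tokens, passage_tokens, k, dim):
    """Keep the query at token level; represent the passage by one
    embedding per chunk. The ordering here is an illustrative choice."""
    query_embs = [embed_token(t, dim) for t in query_tokens]
    chunk_embs = [compress_chunk(c, dim) for c in chunk(passage_tokens, k)]
    return query_embs + chunk_embs


query = list(range(8))            # 8 query tokens
passage = list(range(100, 164))   # 64 retrieved-passage tokens
seq = build_decoder_input(query, passage, k=16, dim=4)

# Decoder input positions without and with compression.
print(len(query) + len(passage), "->", len(seq))  # 72 -> 12
```

With 64 passage tokens and a chunk size of 16, the passage occupies 4 decoder positions instead of 64, which is the source of the latency reduction described above.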
