Retrieval-Augmented Perception

Date

5 months ago

The Retrieval-Augmented Perception (RAP) plug-in was proposed in March 2025 by a team from Nanyang Technological University and Wuhan University. The work is described in the paper “Retrieval-Augmented Perception: High-Resolution Image Perception Meets Visual RAG”, which was accepted at ICML 2025 as a Spotlight paper.

RAP is a training-free plug-in for high-resolution image perception, built on retrieval-augmented generation (RAG). It aims to improve the performance of multimodal large language models (MLLMs) on high-resolution image perception tasks while reducing computational cost, giving the model stronger understanding, contextual awareness, and reasoning in complex environments. Experiments show that RAP significantly improves performance on multiple high-resolution image benchmarks; for example, LLaVA-v1.5-13B improves by 43% on V* Bench and 19% on HR-Bench.
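To make the general idea of applying retrieval to a high-resolution image concrete, the sketch below tiles an image into crops and keeps only the crops whose embeddings are most similar to the query embedding. This is an illustrative assumption about how crop-level visual retrieval can work, not the authors' implementation: the function names (`tile_boxes`, `cosine`, `retrieve_top_k`), the fixed tile size, the cosine-similarity scoring, and the toy embedding vectors are all hypothetical, and a real system would obtain the embeddings from a vision-language retriever.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, tile: int) -> List[Box]:
    """Split a width x height image into row-major tiles.

    Tiles on the right and bottom edges may be smaller than `tile`.
    """
    boxes: List[Box] = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            boxes.append(
                (left, top, min(left + tile, width), min(top + tile, height))
            )
    return boxes


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(
    query_vec: Sequence[float],
    crop_vecs: Sequence[Sequence[float]],
    k: int,
) -> List[int]:
    """Return indices of the k crops most similar to the query.

    Indices are returned in their original row-major order, so the
    selected crops keep their relative positions in the image.
    """
    scores = [cosine(query_vec, vec) for vec in crop_vecs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Toy example: a 4x4 "image" cut into four 2x2 crops, with made-up
# 2-dimensional embeddings standing in for real retriever outputs.
boxes = tile_boxes(4, 4, 2)
crop_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
query_vec = [1.0, 0.0]
selected = retrieve_top_k(query_vec, crop_vecs, k=2)
selected_boxes = [boxes[i] for i in selected]
```

Under these assumptions, only the selected crops would be passed to the MLLM in place of the full image, which is one plausible way a retrieval step could lower the computational cost of processing a high-resolution input.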
