Read-only Prompt Optimization for Vision-Language Few-shot Learning

Dongjun Lee, Seokwon Song, Jihee Suh, Joonmyung Choi, Sanghyeok Lee, Hyunwoo J. Kim

Abstract

In recent years, prompt tuning has proven effective in adapting pre-trained vision-language models to downstream tasks. These methods adapt the pre-trained models by introducing learnable prompts while keeping the pre-trained weights frozen. However, learnable prompts can affect the internal representation within the self-attention module, which may increase performance variance and hurt generalization, especially in data-deficient settings. To address these issues, we propose a novel approach, Read-only Prompt Optimization (RPO). RPO leverages masked attention to prevent the internal representation shift in the pre-trained model. Further, to facilitate the optimization of RPO, the read-only prompts are initialized based on special tokens of the pre-trained model. Our extensive experiments demonstrate that RPO outperforms CLIP and CoCoOp in base-to-new generalization and domain generalization while displaying better robustness. The proposed method also achieves better generalization in extremely data-deficient settings, while improving parameter efficiency and reducing computational overhead. Code is available at https://github.com/mlvlab/RPO.
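
The masked-attention idea in the abstract can be illustrated with a small sketch. This is not the code from the mlvlab/RPO repository; it is a minimal pure-Python illustration under several assumptions: a single attention head, identity query/key/value projections, and no causal mask. The function names `read_only_mask` and `masked_attention` are hypothetical. The mask lets the prompt positions read from the original tokens, while the original tokens never read from the prompts, so their outputs stay the same as in the frozen model. Whether prompts may attend to one another is a design choice this sketch does not settle (it allows it).

```python
import math

def read_only_mask(n_tokens, n_prompts):
    """allowed[q][k] is True when query position q may attend to key position k.

    Positions 0..n_tokens-1 are the original tokens; the prompts are appended after them.
    """
    total = n_tokens + n_prompts
    allowed = [[True] * total for _ in range(total)]
    for q in range(n_tokens):               # original tokens ...
        for k in range(n_tokens, total):    # ... must not read the prompt positions
            allowed[q][k] = False
    return allowed

def masked_attention(x, allowed):
    """Single-head self-attention with identity projections (illustration only)."""
    d = len(x[0])
    out = []
    for q, query in enumerate(x):
        # Disallowed keys get a score of -inf, i.e. zero weight after the softmax.
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(d)
            if allowed[q][k] else -math.inf
            for k, key in enumerate(x)
        ]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        out.append([sum(w / norm * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

# Hypothetical toy features: three original tokens and one appended prompt.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prompts = [[0.5, -0.5]]

# Output of the "frozen" model, which never sees any prompt.
frozen = masked_attention(tokens, read_only_mask(len(tokens), 0))

# Same tokens with a read-only prompt appended; keep only the original positions.
with_prompts = masked_attention(
    tokens + prompts, read_only_mask(len(tokens), len(prompts))
)[: len(tokens)]

# The original token outputs match the frozen model up to floating-point tolerance.
unchanged = all(
    abs(a - b) < 1e-12
    for row_a, row_b in zip(frozen, with_prompts)
    for a, b in zip(row_a, row_b)
)
```

With the mask removed (every entry set to `True`), the outputs at the original token positions would change, which is the representation shift the abstract describes.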

Code Repositories

mlvlab/rpo (official, PyTorch)

Benchmarks

Benchmark                                       Methodology   Metrics
prompt-engineering-on-caltech-101               RPO           Harmonic mean: 96.03
prompt-engineering-on-dtd                       RPO           Harmonic mean: 68.61
prompt-engineering-on-eurosat                   RPO           Harmonic mean: 76.79
prompt-engineering-on-fgvc-aircraft             RPO           Harmonic mean: 35.70
prompt-engineering-on-food-101                  RPO           Harmonic mean: 90.58
prompt-engineering-on-imagenet                  RPO           Harmonic mean: 74.00
prompt-engineering-on-oxford-102-flower         RPO           Harmonic mean: 84.50
prompt-engineering-on-oxford-iiit-pet-dataset   RPO           Harmonic mean: 96.05
prompt-engineering-on-stanford-cars-1           RPO           Harmonic mean: 74.69
prompt-engineering-on-sun397                    RPO           Harmonic mean: 79.18
prompt-engineering-on-ucf101                    RPO           Harmonic mean: 79.34
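
This page does not state how the harmonic mean is computed. In base-to-new generalization evaluations it is commonly the harmonic mean of base-class accuracy and new-class accuracy, and the sketch below assumes that definition. The function name `harmonic_mean` and the input values are hypothetical and are not results from the paper.

```python
def harmonic_mean(base_acc, new_acc):
    """Harmonic mean of two accuracies given in percent (assumed metric definition)."""
    if base_acc + new_acc == 0:
        return 0.0  # avoid division by zero when both accuracies are zero
    return 2 * base_acc * new_acc / (base_acc + new_acc)

# Hypothetical accuracies, chosen only to show how the metric behaves.
score = harmonic_mean(80.0, 60.0)
```

The harmonic mean is pulled toward the lower of the two values, so a method cannot score well by doing well on base classes alone.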
