3 months ago

Consistency-guided Prompt Learning for Vision-Language Models

Shuvendu Roy, Ali Etemad

Abstract

We propose Consistency-guided Prompt learning (CoPrompt), a new fine-tuning method for vision-language models. Our approach improves the generalization of large foundation models when fine-tuned on downstream tasks in a few-shot setting. The basic idea of CoPrompt is to enforce a consistency constraint between the predictions of the trainable and pre-trained models to prevent overfitting on the downstream task. Additionally, we introduce two components into our consistency constraint to further boost performance: enforcing consistency on two perturbed inputs, and combining two dominant tuning paradigms, prompting and adapters. Enforcing consistency on perturbed inputs further regularizes the consistency constraint, thereby improving generalization. Moreover, the integration of adapters and prompts not only enhances performance on downstream tasks but also offers increased tuning flexibility in both input and output spaces. This facilitates more effective adaptation to downstream tasks in a few-shot learning setting. Experiments show that CoPrompt outperforms existing methods on a range of evaluation suites, including base-to-novel generalization, domain generalization, and cross-dataset evaluation. On generalization, CoPrompt improves the state-of-the-art on zero-shot tasks and the overall harmonic mean over 11 datasets. Detailed ablation studies show the effectiveness of each of the components in CoPrompt. We make our code available at https://github.com/ShuvenduRoy/CoPrompt.
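The consistency constraint described in the abstract can be illustrated with a minimal sketch. It assumes cosine distance as the consistency measure and a simple weighted sum with the task loss; the abstract specifies neither, and the function names (`consistency_loss`, `total_loss`) and the `weight` parameter are hypothetical, not taken from the official repository. The feature lists stand in for embeddings that the trainable and frozen pre-trained encoders would produce for two perturbed views of the same inputs.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def consistency_loss(trainable_feats, frozen_feats):
    """Mean cosine distance between paired embeddings.

    trainable_feats: embeddings from the model being tuned (one perturbed view).
    frozen_feats: embeddings from the frozen pre-trained model (another view).
    Assumption: cosine distance is an illustrative choice of consistency measure.
    """
    distances = [
        1.0 - cosine_similarity(t, f)
        for t, f in zip(trainable_feats, frozen_feats)
    ]
    return sum(distances) / len(distances)


def total_loss(task_loss, trainable_feats, frozen_feats, weight=1.0):
    """Task loss plus a weighted consistency penalty (hypothetical weighting)."""
    return task_loss + weight * consistency_loss(trainable_feats, frozen_feats)
```

When the tuned embeddings stay aligned with the frozen ones, the penalty is near zero; as they drift apart during few-shot tuning, the penalty grows, which is the regularizing effect the abstract attributes to the constraint.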

Code Repositories

shuvenduroy/coprompt (official, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| prompt-engineering-on-caltech-101 | CoPrompt | Harmonic mean: 96.55 |
| prompt-engineering-on-dtd | CoPrompt | Harmonic mean: 72.79 |
| prompt-engineering-on-eurosat | CoPrompt | Harmonic mean: 85.84 |
| prompt-engineering-on-fgvc-aircraft | CoPrompt | Harmonic mean: 39.76 |
| prompt-engineering-on-food-101 | CoPrompt | Harmonic mean: 91.40 |
| prompt-engineering-on-imagenet | CoPrompt | Harmonic mean: 74.33 |
| prompt-engineering-on-imagenet-a | CoPrompt | Top-1 accuracy %: 50.50 |
| prompt-engineering-on-imagenet-r | CoPrompt | Top-1 accuracy %: 77.51 |
| prompt-engineering-on-imagenet-s | CoPrompt | Top-1 accuracy %: 49.43 |
| prompt-engineering-on-oxford-102-flower | CoPrompt | Harmonic mean: 85.71 |
| prompt-engineering-on-oxford-iiit-pet-dataset | CoPrompt | Harmonic mean: 96.87 |
| prompt-engineering-on-stanford-cars-1 | CoPrompt | Harmonic mean: 75.66 |
| prompt-engineering-on-sun397 | CoPrompt | Harmonic mean: 81.31 |
| prompt-engineering-on-ucf101 | CoPrompt | Harmonic mean: 83.07 |
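The harmonic-mean metric reported for most benchmarks can be sketched as follows, assuming it combines accuracy on base (seen) classes and novel (unseen) classes, as is usual in base-to-novel generalization. The page does not list the underlying base and novel accuracies, so the values in the usage note are hypothetical.

```python
def harmonic_mean(base_acc, novel_acc):
    """Harmonic mean of base-class and novel-class accuracy.

    The harmonic mean is dominated by the smaller value, so a method
    cannot score well by excelling on base classes alone.
    """
    if base_acc + novel_acc == 0:
        return 0.0
    return 2 * base_acc * novel_acc / (base_acc + novel_acc)
```

For hypothetical accuracies of 80 on base classes and 60 on novel classes, `harmonic_mean(80.0, 60.0)` gives roughly 68.57, below the arithmetic mean of 70.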
