5 months ago

Sentence-level Prompts Benefit Composed Image Retrieval

Bai Yang ; Xu Xinxing ; Liu Yong ; Khan Salman ; Khan Fahad ; Zuo Wangmeng ; Goh Rick Siow Mong ; Feng Chun-Mei

Abstract

Composed image retrieval (CIR) is the task of retrieving specific images by using a query that involves both a reference image and a relative caption. Most existing CIR models adopt the late-fusion strategy to combine visual and language features. Besides, several approaches have also been suggested to generate a pseudo-word token from the reference image, which is further integrated into the relative caption for CIR. However, these pseudo-word-based prompting methods have limitations when target image encompasses complex changes on reference image, e.g., object removal and attribute modification. In this work, we demonstrate that learning an appropriate sentence-level prompt for the relative caption (SPRC) is sufficient for achieving effective composed image retrieval. Instead of relying on pseudo-word-based prompts, we propose to leverage pretrained V-L models, e.g., BLIP-2, to generate sentence-level prompts. By concatenating the learned sentence-level prompt with the relative caption, one can readily use existing text-based image retrieval models to enhance CIR performance. Furthermore, we introduce both image-text contrastive loss and text prompt alignment loss to enforce the learning of suitable sentence-level prompts. Experiments show that our proposed method performs favorably against the state-of-the-art CIR methods on the Fashion-IQ and CIRR datasets. The source code and pretrained model are publicly available at https://github.com/chunmeifeng/SPRC
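
The abstract names an image-text contrastive loss but does not give its form. As a rough illustration only, the sketch below assumes a standard InfoNCE-style batch loss in which the i-th query feature (prompt plus relative caption) should match the i-th target image feature and treat every other target in the batch as a negative. The function names, the cosine similarity, and the temperature of 0.07 are assumptions made for this example and are not taken from the SPRC repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(query_feats, target_feats, temperature=0.07):
    """InfoNCE-style loss: query i is the positive for target i (assumed setup).

    query_feats and target_feats are equal-length lists of feature vectors.
    Returns the mean negative log-probability of the matching target.
    """
    total = 0.0
    for i, query in enumerate(query_feats):
        logits = [cosine(query, target) / temperature for target in target_feats]
        # Log-sum-exp with the max subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(query_feats)


if __name__ == "__main__":
    queries = [[1.0, 0.0], [0.0, 1.0]]
    print(contrastive_loss(queries, [[1.0, 0.0], [0.0, 1.0]]))  # matched pairs
    print(contrastive_loss(queries, [[0.0, 1.0], [1.0, 0.0]]))  # mismatched pairs
```

With matched pairs the loss is close to zero, and with mismatched pairs it is large, which is the behavior a retrieval model trained this way relies on.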

Code Repositories

chunmeifeng/sprc
Official
pytorch

Benchmarks

Benchmark | Methodology | Metrics
image-retrieval-on-cirr | SPRC2 | (Recall@5+Recall_subset@1)/2: 82.66; Recall@10: 90.39
image-retrieval-on-cirr | SPRC | (Recall@5+Recall_subset@1)/2: 81.39; Recall@10: 89.74
image-retrieval-on-fashion-iq | SPRC | (Recall@10+Recall@50)/2: 64.85; Recall@10: 54.92
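
The metrics above are Recall@K values and averages of two recall values. As a minimal sketch of how such numbers are typically computed, assuming one ground-truth target per query and a ranked candidate list per query, the helper names below are hypothetical and are not part of any benchmark toolkit.

```python
def recall_at_k(rankings, targets, k):
    """Percentage of queries whose target appears among the top-k candidates.

    rankings: one ranked list of candidate ids per query (best match first).
    targets:  the ground-truth id for each query, in the same order.
    """
    hits = sum(1 for ranked, target in zip(rankings, targets) if target in ranked[:k])
    return 100.0 * hits / len(targets)


def mean_recall(rankings, targets, ks):
    """Average of Recall@K over the given cutoffs, e.g. ks=(10, 50)."""
    return sum(recall_at_k(rankings, targets, k) for k in ks) / len(ks)


if __name__ == "__main__":
    rankings = [["a", "b", "c"], ["d", "e", "f"]]
    targets = ["b", "f"]
    print(recall_at_k(rankings, targets, 2))
    print(mean_recall(rankings, targets, (1, 3)))
```

Because the Fashion-IQ row reports both the average (64.85) and Recall@10 (54.92), the implied Recall@50 is 2 × 64.85 − 54.92 = 74.78.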
