Bi-directional Training for Composed Image Retrieval via Text Prompt Learning

Liu Zheyuan ; Sun Weixuan ; Hong Yicong ; Teney Damien ; Gould Stephen

Abstract

Composed image retrieval searches for a target image based on a multi-modal user query comprised of a reference image and modification text describing the desired changes. Existing approaches to solving this challenging task learn a mapping from the (reference image, modification text)-pair to an image embedding that is then matched against a large image corpus. One area that has not yet been explored is the reverse direction, which asks the question, what reference image when modified as described by the text would produce the given target image? In this work we propose a bi-directional training scheme that leverages such reversed queries and can be applied to existing composed image retrieval architectures with minimum changes, which improves the performance of the model. To encode the bi-directional query we prepend a learnable token to the modification text that designates the direction of the query and then finetune the parameters of the text embedding module. We make no other changes to the network architecture. Experiments on two standard datasets show that our novel approach achieves improved performance over a baseline BLIP-based model that itself already achieves competitive performance. Our code is released at https://github.com/Cuberick-Orion/Bi-Blip4CIR.
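
The reversed-query idea in the abstract can be illustrated at the data level. The following is a minimal sketch, not the authors' implementation: the names `Query`, `make_bidirectional_queries`, `FORWARD_TOKEN`, and `BACKWARD_TOKEN` are hypothetical, and the direction markers are shown as fixed placeholder strings, whereas the paper describes a learnable token whose training is not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical placeholder strings. In the paper the direction marker is a
# learnable token; this data-level sketch does not train any embedding.
FORWARD_TOKEN = "[FWD]"
BACKWARD_TOKEN = "[BWD]"


@dataclass(frozen=True)
class Query:
    image_id: str   # image given as the visual part of the query
    text: str       # modification text with a direction token prepended
    answer_id: str  # image the query is expected to retrieve


def make_bidirectional_queries(reference_id, modification_text, target_id):
    """Turn one (reference, text, target) triplet into a forward and a reversed query."""
    # Forward query: reference image + text should retrieve the target image.
    forward = Query(reference_id, f"{FORWARD_TOKEN} {modification_text}", target_id)
    # Reversed query: target image + the same text should retrieve the reference image.
    backward = Query(target_id, f"{BACKWARD_TOKEN} {modification_text}", reference_id)
    return [forward, backward]


queries = make_bidirectional_queries("img_001", "make the dress red", "img_002")
for query in queries:
    print(query)
```

Each annotated triplet thus yields two training queries that share the same modification text and differ only in the prepended direction token and in which image serves as the query and which as the answer.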

Code Repositories

Cuberick-Orion/Bi-Blip4CIR (official; PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
image-retrieval-on-cirr | BLIP4CIR+Bi | (Recall@5+Recall_subset@1)/2: 72.59; Recall@10: 83.88
image-retrieval-on-fashion-iq | BLIP4CIR+Bi | (Recall@10+Recall@50)/2: 55.4
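
The benchmark metrics are averages of two Recall@K values. The sketch below shows how such an average could be computed on toy data; `recall_at_k` and the example rankings are illustrative assumptions, not the benchmarks' official evaluation code. For CIRR, Recall_subset@1 is assumed here to be the same computation applied to a restricted candidate list per query.

```python
def recall_at_k(ranked_ids, target_ids, k):
    """Percentage of queries whose target appears among the top-k ranked candidates."""
    if not target_ids or len(ranked_ids) != len(target_ids):
        raise ValueError("need one non-empty ranking per target")
    hits = sum(target in ranked[:k] for ranked, target in zip(ranked_ids, target_ids))
    return 100.0 * hits / len(target_ids)


# Toy data: four queries, each with a ranked candidate list; the target is always "a".
rankings = [
    ["a", "b", "c"],  # target at rank 1
    ["c", "a", "b"],  # target at rank 2
    ["b", "c", "a"],  # target at rank 3
    ["c", "b", "a"],  # target at rank 3
]
targets = ["a", "a", "a", "a"]

r1 = recall_at_k(rankings, targets, 1)  # 1 of 4 queries hit -> 25.0
r2 = recall_at_k(rankings, targets, 2)  # 2 of 4 queries hit -> 50.0
average = (r1 + r2) / 2                 # 37.5
print(r1, r2, average)
```

A reported figure such as (Recall@10+Recall@50)/2 would follow the same pattern with k=10 and k=50 over the full evaluation set.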
