Data Roaming and Quality Assessment for Composed Image Retrieval

Levy Matan; Ben-Ari Rami; Darshan Nir; Lischinski Dani

Abstract

The task of Composed Image Retrieval (CoIR) involves queries that combine image and text modalities, allowing users to express their intent more effectively. However, current CoIR datasets are orders of magnitude smaller than other vision and language (V&L) datasets. Additionally, some of these datasets have noticeable issues, such as queries containing redundant modalities. To address these shortcomings, we introduce the Large Scale Composed Image Retrieval (LaSCo) dataset, a new CoIR dataset which is ten times larger than existing ones. Pre-training on LaSCo shows a noteworthy improvement in performance, even in zero-shot settings. Furthermore, we propose a new approach for analyzing CoIR datasets and methods, which detects modality redundancy or necessity in queries. We also introduce a new CoIR baseline, the Cross-Attention driven Shift Encoder (CASE). This baseline allows for early fusion of modalities using a cross-attention module and employs an additional auxiliary task during training. Our experiments demonstrate that this new baseline outperforms the current state-of-the-art methods on established benchmarks like FashionIQ and CIRR.

Code Repositories

levymsn/LaSCo

Official

Mentioned in GitHub

Benchmarks

Benchmark	Methodology	Metrics
image-retrieval-on-cirr	CASE (Pre-trained on LaSCo.Ca)	(Recall@5+Recall_subset@1)/2: 78.25; Recall@10: 88.75
image-retrieval-on-cirr	CASE	(Recall@5+Recall_subset@1)/2: 77.5; Recall@10: 87.25
image-retrieval-on-fashion-iq	CASE	(Recall@10+Recall@50)/2: 59.73; Recall@10: 48.79
image-retrieval-on-lasco	BLIP4CIR	Recall@1 (%): 4.26
image-retrieval-on-lasco	CASE	Recall@1 (%): 7.08
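
The metrics in the table are Recall@K values and arithmetic means of two such values. As a reading aid, here is a minimal sketch of how Recall@K is commonly computed for retrieval, assuming each query has exactly one ground-truth target and a ranked list of candidate IDs. The function names, toy image IDs, and data layout are illustrative assumptions and are not taken from the LaSCo repository or the paper's evaluation code.

```python
from typing import Sequence


def recall_at_k(rankings: Sequence[Sequence[str]], targets: Sequence[str], k: int) -> float:
    """Percentage of queries whose target appears among the top-k ranked items.

    Assumes one ground-truth target per query (an assumption of this sketch).
    """
    if len(rankings) != len(targets):
        raise ValueError("expected one ranked list per target")
    if not targets:
        raise ValueError("expected at least one query")
    hits = sum(1 for ranking, target in zip(rankings, targets) if target in ranking[:k])
    return 100.0 * hits / len(targets)


def mean_of_two(a: float, b: float) -> float:
    """Arithmetic mean of two recall values, as in the averaged table columns."""
    return (a + b) / 2.0


# Toy data (hypothetical IDs): one ranked candidate list per query, best match first.
rankings = [
    ["img_3", "img_7", "img_1"],
    ["img_2", "img_5", "img_9"],
    ["img_4", "img_8", "img_6"],
    ["img_0", "img_1", "img_2"],
]
targets = ["img_3", "img_9", "img_5", "img_1"]

r1 = recall_at_k(rankings, targets, k=1)  # only the first query hits at rank 1
r3 = recall_at_k(rankings, targets, k=3)  # the first, second and fourth queries hit
avg = mean_of_two(r1, r3)
print(f"Recall@1: {r1:.2f}  Recall@3: {r3:.2f}  mean: {avg:.2f}")
```

The subset variant used for CIRR (Recall_subset@1) restricts the candidate pool for each query; that restriction is not modeled in this sketch.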
