5 months ago

Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization

Chen Yiyang ; Zheng Zhedong ; Ji Wei ; Qu Leigang ; Chua Tat-Seng

Abstract

We investigate composed image retrieval with text feedback. Users gradually look for the target of interest by moving from coarse to fine-grained feedback. However, existing methods merely focus on the latter, i.e., fine-grained search, by harnessing positive and negative pairs during training. This pair-based paradigm only considers the one-to-one distance between a pair of specific points, which is not aligned with the one-to-many coarse-grained retrieval process and compromises the recall rate. In an attempt to fill this gap, we introduce a unified learning approach to simultaneously model the coarse- and fine-grained retrieval by considering the multi-grained uncertainty. The key idea underpinning the proposed method is to integrate fine- and coarse-grained retrieval as matching data points with small and large fluctuations, respectively. Specifically, our method contains two modules: uncertainty modeling and uncertainty regularization. (1) The uncertainty modeling simulates the multi-grained queries by introducing identically distributed fluctuations in the feature space. (2) Based on the uncertainty modeling, we further introduce uncertainty regularization to adapt the matching objective according to the fluctuation range. Compared with existing methods, the proposed strategy explicitly prevents the model from pushing away potential candidates in the early stage, and thus improves the recall rate. On the three public datasets, i.e., FashionIQ, Fashion200k, and Shoes, the proposed method has achieved +4.03%, +3.38%, and +2.40% Recall@50 accuracy over a strong baseline, respectively.
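
The two modules named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the sketch assumes zero-mean Gaussian noise for the feature fluctuations and an uncertainty-weighted loss of the form loss / (2σ²) + ½ log σ², a common way to down-weight an objective as noise grows. The function names and the toy feature vectors are hypothetical.

```python
import math
import random


def jitter_feature(feature, sigma, rng):
    """Return a noisy copy of `feature`.

    Assumption: every dimension receives i.i.d. Gaussian noise N(0, sigma^2).
    A small sigma mimics a fine-grained query, a large sigma a coarse one.
    """
    return [x + rng.gauss(0.0, sigma) for x in feature]


def squared_distance(a, b):
    """Squared Euclidean distance, used here as a stand-in matching loss."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def uncertainty_weighted_loss(match_loss, sigma):
    """Assumed regularizer: match_loss / (2 sigma^2) + 0.5 * log(sigma^2).

    The first term relaxes the matching objective when the fluctuation is
    large; the log term keeps sigma from growing without bound.
    """
    return match_loss / (2.0 * sigma ** 2) + 0.5 * math.log(sigma ** 2)


rng = random.Random(0)
query = [0.2, -0.1, 0.4]      # made-up composed (image + text) query feature
target = [0.25, -0.05, 0.35]  # made-up target image feature

for sigma in (0.1, 1.0):  # small fluctuation vs. large fluctuation
    noisy_target = jitter_feature(target, sigma, rng)
    loss = uncertainty_weighted_loss(squared_distance(query, noisy_target), sigma)
    print(f"sigma={sigma}: loss={loss:.4f}")
```

Under these assumptions, a larger fluctuation range scales the distance term down, so a coarse match is not penalized as hard as a fine-grained one. The official repository listed under Code Repositories is the reference for the actual formulation.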

Code Repositories

Monoxide-Chen/uncertainty_retrieval (official, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
image-retrieval-on-fashion-iq | MUR (4*ResNet50) | (Recall@10+Recall@50)/2: 50.61
image-retrieval-on-fashion-iq | MUR | (Recall@10+Recall@50)/2: 47.28
image-retrieval-with-multi-modal-query-on | Multi-grained Uncertainty Regularization (MUR) | Recall@1: 21.8; Recall@10: 52.1; Recall@50: 70.2
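
For readers unfamiliar with the metrics above, Recall@K is the percentage of queries whose target appears among the top K retrieved items, and (Recall@10+Recall@50)/2 is the mean of two such values. The following is a minimal sketch of that computation on made-up toy rankings; the function names are hypothetical, and K = 1 and 2 are used only because the toy gallery has three items.

```python
def recall_at_k(ranked_lists, targets, k):
    """Percentage of queries whose target is among the top-k ranked items."""
    hits = sum(
        1 for ranked, target in zip(ranked_lists, targets) if target in ranked[:k]
    )
    return 100.0 * hits / len(targets)


def mean_recall(ranked_lists, targets, ks=(10, 50)):
    """Average of Recall@K over the given cutoffs, e.g. (R@10 + R@50) / 2."""
    return sum(recall_at_k(ranked_lists, targets, k) for k in ks) / len(ks)


# Toy example: four queries over a three-item gallery (all values made up).
ranked = [["a", "b", "c"], ["c", "a", "b"], ["b", "c", "a"], ["a", "c", "b"]]
targets = ["a", "b", "c", "c"]

print(recall_at_k(ranked, targets, 1))          # 25.0
print(recall_at_k(ranked, targets, 2))          # 75.0
print(mean_recall(ranked, targets, ks=(1, 2)))  # 50.0
```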
