Learning Semantic-Aligned Feature Representation for Text-based Person Search

Li Shiping ; Cao Min ; Zhang Min

Abstract

Text-based person search aims to retrieve images of a certain pedestrian from a textual description. The key challenge of this task is to eliminate the inter-modality gap and achieve feature alignment across modalities. In this paper, we propose a semantic-aligned embedding method for text-based person search, in which feature alignment across modalities is achieved by automatically learning semantic-aligned visual and textual features. First, we introduce two Transformer-based backbones to encode robust feature representations of the images and texts. Second, we design a semantic-aligned feature aggregation network that adaptively selects and aggregates features with the same semantics into part-aware features; this is achieved by a multi-head attention module constrained by a cross-modality part alignment loss and a diversity loss. Experimental results on the CUHK-PEDES and Flickr30K datasets show that our method achieves state-of-the-art performance.
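
The aggregation step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses a single attention head instead of the multi-head module, plain Python lists instead of tensors, and the names `aggregate_parts`, `diversity_penalty`, and `part_alignment_score` are hypothetical. The abstract does not give the form of either loss, so the cosine-based quantities below are assumptions chosen only to show the idea: part queries attend over token features, distinct parts are pushed apart, and same-index image and text parts are pulled together.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_parts(tokens, queries):
    """Pool N token features into K part features with scaled dot-product
    attention (single head; an assumed simplification)."""
    dim = len(tokens[0])
    parts, attn_maps = [], []
    for q in queries:
        scores = [sum(qi * ti for qi, ti in zip(q, t)) / math.sqrt(dim)
                  for t in tokens]
        weights = softmax(scores)
        # Weighted sum of token features gives one part-aware feature.
        part = [sum(w * t[j] for w, t in zip(weights, tokens))
                for j in range(dim)]
        parts.append(part)
        attn_maps.append(weights)
    return parts, attn_maps

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-12)

def diversity_penalty(parts):
    """Mean cosine similarity between distinct parts (assumed form);
    lower values mean the parts capture different semantics."""
    k = len(parts)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return sum(cosine(parts[i], parts[j]) for i, j in pairs) / len(pairs)

def part_alignment_score(img_parts, txt_parts):
    """Mean cosine similarity between same-index image and text parts
    (assumed form); higher values mean better cross-modal alignment."""
    total = sum(cosine(a, b) for a, b in zip(img_parts, txt_parts))
    return total / len(img_parts)
```

In a training setup, the part queries would be learned parameters shared across modalities, and the two quantities would enter the objective with opposite signs: the diversity term minimized, the alignment term maximized.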

Code Repositories

reallsp/SAF (official, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
nlp-based-person-retrival-on-cuhk-pedes | SAF | R@1: 64.13, R@5: 82.62, R@10: 88.4
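
R@K here is the usual recall-at-K retrieval metric: the percentage of text queries for which at least one image of the correct person appears among the top K ranked gallery images. The page does not describe the evaluation protocol, so the sketch below is a generic version under that assumption; the function name `recall_at_k` and its arguments are hypothetical.

```python
def recall_at_k(similarity, query_ids, gallery_ids, ks=(1, 5, 10)):
    """Percentage of queries with at least one correct gallery item in the top k.

    similarity[i][j] is the score between text query i and gallery image j;
    query_ids and gallery_ids hold the person identity of each item.
    """
    hits = {k: 0 for k in ks}
    for row, qid in zip(similarity, query_ids):
        # Rank gallery indices by descending similarity to this query.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        for k in ks:
            if any(gallery_ids[j] == qid for j in ranked[:k]):
                hits[k] += 1
    return {k: 100.0 * hits[k] / len(query_ids) for k in ks}
```

For example, with two queries where the first is matched at rank 1 and the second at rank 2, the function reports 50.0 for R@1 and 100.0 for R@2.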
