Jialong Zuo; Jiahao Hong; Feng Zhang; Changqian Yu; Hanyu Zhou; Changxin Gao; Nong Sang; Jingdong Wang

Abstract
Language-image pre-training is an effective technique for learning powerful representations in general domains. However, when applied directly to person representation learning, these general pre-training methods perform unsatisfactorily because they neglect critical person-related characteristics, i.e., fine-grained attributes and identities. To address this issue, we propose a novel language-image pre-training framework for person representation learning, termed PLIP. Specifically, we design three pretext tasks: 1) Text-guided Image Colorization, which aims to establish the correspondence between person-related image regions and fine-grained color-part textual phrases; 2) Image-guided Attributes Prediction, which aims to mine fine-grained attribute information about the person's body in the image; and 3) Identity-based Vision-Language Contrast, which aims to correlate the cross-modal representations at the identity level rather than the instance level. Moreover, to implement our pre-training framework, we construct SYNTH-PEDES, a large-scale person dataset of image-text pairs, by automatically generating textual annotations. We pre-train PLIP on SYNTH-PEDES and evaluate our models across a range of downstream person-centric tasks. PLIP not only significantly improves existing methods on all these tasks, but also shows strong ability in the zero-shot and domain generalization settings. The code, dataset and weights will be released at https://github.com/Zplusdragon/PLIP.
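To make the difference between instance-level and identity-level contrast concrete, the following is a minimal sketch of one plausible formulation, not the loss used in PLIP. It assumes a symmetric softmax contrastive objective in which every image-text pair sharing an identity label counts as a positive. The function name `identity_contrastive_loss`, the `temperature` value, and the averaging over positives are all assumptions made for illustration.

```python
import math


def identity_contrastive_loss(image_feats, text_feats, identities, temperature=0.07):
    """Hypothetical identity-level contrastive loss (illustrative only).

    image_feats, text_feats: lists of equal-length float vectors, one per sample.
    identities: list of identity labels, one per sample.
    Pairs (i, j) are treated as positives whenever identities[i] == identities[j],
    instead of only when i == j as in instance-level contrast.
    """
    n = len(image_feats)

    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    img = [normalize(v) for v in image_feats]
    txt = [normalize(v) for v in text_feats]

    # Cosine similarity between every image and every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img[i], txt[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def one_direction(rows):
        total = 0.0
        for i, row in enumerate(rows):
            # Numerically stable log-sum-exp over the row.
            peak = max(row)
            log_denom = peak + math.log(sum(math.exp(x - peak) for x in row))
            positives = [j for j in range(n) if identities[j] == identities[i]]
            # Average negative log-probability over all same-identity candidates.
            total += -sum(row[j] - log_denom for j in positives) / len(positives)
        return total / n

    text_to_image = [list(col) for col in zip(*logits)]
    return 0.5 * (one_direction(logits) + one_direction(text_to_image))
```

With this formulation, two images of the same person are not pushed apart merely because they are different instances, which is the intuition behind correlating representations at the identity level.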
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| nlp-based-person-retrival-on-cuhk-pedes | PLIP-RN50 | R@1: 69.23 R@5: 85.84 R@10: 91.16 |
| person-re-identification-on-dukemtmc-reid | PLIP-RN50-MGN | mAP: 81.7 |
| person-re-identification-on-market-1501 | PLIP-RN50-ABDNet | mAP: 91.2 |
| text-based-person-retrieval-on-icfg-pedes | PLIP-RN50 | R@1: 64.25 R@5: 80.88 R@10: 86.32 |
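
The R@K figures in the table are recall-at-K retrieval scores. As a reading aid, here is a minimal sketch of how such a score is commonly computed for text-to-image retrieval, assuming a query counts as a hit when at least one gallery item with the same identity appears among its top K results. The function name `recall_at_k` and its arguments are hypothetical, and the exact evaluation protocol of each benchmark may differ.

```python
def recall_at_k(similarity, query_ids, gallery_ids, ks=(1, 5, 10)):
    """Fraction of queries with at least one correct gallery item in the top k.

    similarity: list of rows, similarity[q][g] is the score between query q
                and gallery item g (higher means more similar).
    query_ids, gallery_ids: identity labels for queries and gallery items.
    """
    hits = {k: 0 for k in ks}
    for row, query_id in zip(similarity, query_ids):
        # Rank gallery indices by descending similarity to this query.
        ranked = sorted(range(len(row)), key=lambda g: row[g], reverse=True)
        for k in ks:
            if any(gallery_ids[g] == query_id for g in ranked[:k]):
                hits[k] += 1
    return {k: hits[k] / len(query_ids) for k in ks}


# Toy example with three text queries and three gallery images.
scores = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.1],
]
result = recall_at_k(scores, [0, 1, 2], [0, 1, 2], ks=(1, 2, 3))
```

Benchmarks usually report these fractions as percentages, so a value of 0.6923 would be listed as R@1: 69.23.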