Yinglin Zheng, Hao Yang, Ting Zhang, Jianmin Bao, Dongdong Chen, Yangyu Huang, Lu Yuan, Dong Chen, Ming Zeng, Fang Wen

Abstract
How can we learn a universal facial representation that boosts all face analysis tasks? This paper takes one step toward this goal. We study the transfer performance of pre-trained models on face analysis tasks and introduce a framework, called FaRL, for general Facial Representation Learning in a visual-linguistic manner. On the one hand, the framework uses a contrastive loss to learn high-level semantic meaning from image-text pairs. On the other hand, we propose exploring low-level information simultaneously to further enhance the face representation, by adding a masked image modeling objective. We perform pre-training on LAION-FACE, a dataset containing a large number of face image-text pairs, and evaluate the representation capability on multiple downstream tasks. We show that FaRL achieves better transfer performance than previous pre-trained models. We also verify its superiority in the low-data regime. More importantly, our model surpasses the state-of-the-art methods on face analysis tasks including face parsing and face alignment.
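The abstract names a contrastive loss over image-text pairs but does not spell out its form. As a rough illustration only, the sketch below assumes a symmetric, CLIP-style InfoNCE loss with a temperature of 0.07; the function names, the temperature value, and the toy embeddings are all assumptions for this example and are not taken from the paper.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    # Numerically stable log-softmax over a list of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    # Assumed symmetric InfoNCE: the i-th image and i-th text form the
    # positive pair; every other pairing in the batch is a negative.
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    # Image-to-text direction: each row should peak on its diagonal entry.
    i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image direction: the same check on the columns.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n
    return 0.5 * (i2t + t2i)

# Toy 2-D embeddings: matched pairs should score a much lower loss
# than deliberately mismatched ones.
images = [[1.0, 0.0], [0.0, 1.0]]
matched_texts = [[1.0, 0.0], [0.0, 1.0]]
swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
loss_matched = contrastive_loss(images, matched_texts)
loss_swapped = contrastive_loss(images, swapped_texts)
```

In this toy case the matched batch yields a loss close to zero and the swapped batch a large one, which is the behavior a contrastive objective is meant to reward. The masked image modeling term mentioned in the abstract is not covered here.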
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| face-alignment-on-300w | FaRL-B (epoch 16) | NME_inter-ocular (%, Challenge): 4.45; NME_inter-ocular (%, Common): 2.56; NME_inter-ocular (%, Full): 2.93; NME_inter-pupil (%, Challenge): 6.42; NME_inter-pupil (%, Common): 3.53; NME_inter-pupil (%, Full): 4.11 |
| face-alignment-on-300w | FaRL-B (epoch 64) | NME_inter-ocular (%, Challenge): 4.42; NME_inter-ocular (%, Common): 2.50; NME_inter-ocular (%, Full): 2.88; NME_inter-pupil (%, Challenge): 6.38; NME_inter-pupil (%, Common): 3.46; NME_inter-pupil (%, Full): 4.05 |
| face-alignment-on-aflw-19 | FaRL-B (epoch 16) | AUC_box@0.07 (%, Full): 81.3; NME_box (%, Full): 1.334; NME_diag (%, Frontal): 0.821; NME_diag (%, Full): 0.943 |
| face-alignment-on-wfw-extra-data | FaRL-B (epoch 16) | AUC@10 (inter-ocular): 61.16; FR@10 (inter-ocular): 1.76; NME (inter-ocular): 3.96 |
| face-parsing-on-celebamask-hq | FaRL-B | Mean F1: 89.56 |
| face-parsing-on-lapa | FaRL-B | Mean F1: 93.88 |
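
The face alignment rows report normalized mean error (NME) and a failure rate (FR@10). As a reading aid, the sketch below follows the commonly used definitions: NME is the mean landmark error divided by a normalizing distance (inter-ocular, inter-pupil, box size, or diagonal), expressed in percent, and the failure rate is the share of images whose NME exceeds a threshold. The function names and toy coordinates are assumptions for illustration, and each benchmark's exact evaluation protocol may differ in detail.

```python
import math

def nme_percent(pred, gt, norm_dist):
    # Mean Euclidean error over landmarks, divided by a normalizing
    # distance chosen by the caller, reported as a percentage.
    errors = [math.dist(p, g) for p, g in zip(pred, gt)]
    return 100.0 * (sum(errors) / len(errors)) / norm_dist

def failure_rate_percent(per_image_nme, threshold=10.0):
    # Share of images whose NME exceeds the threshold (10% for FR@10).
    failures = sum(1 for value in per_image_nme if value > threshold)
    return 100.0 * failures / len(per_image_nme)

# Toy example: two ground-truth landmarks 10 pixels apart, each
# prediction off by 1 pixel; the normalizer is the distance between them.
gt = [(0.0, 0.0), (10.0, 0.0)]
pred = [(0.0, 1.0), (10.0, 1.0)]
score = nme_percent(pred, gt, norm_dist=math.dist(gt[0], gt[1]))
fr = failure_rate_percent([3.0, 12.0, 8.0, 25.0])
```

Lower is better for both NME and the failure rate, while higher is better for the AUC and Mean F1 columns.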