5 months ago

Stacked Dense U-Nets with Dual Transformers for Robust Face Alignment

Jia Guo; Jiankang Deng; Niannan Xue; Stefanos Zafeiriou

Abstract

Facial landmark localisation in images captured in-the-wild is an important and challenging problem. The current state-of-the-art revolves around certain kinds of Deep Convolutional Neural Networks (DCNNs) such as stacked U-Nets and Hourglass networks. In this work, we propose stacked dense U-Nets for this task. We design a novel scale aggregation network topology and a channel aggregation building block to improve the model's capacity without increasing computational complexity or model size. With the assistance of deformable convolutions inside the stacked dense U-Nets and a coherent loss for outside data transformation, our model obtains the ability to be spatially invariant to arbitrary input face images. Extensive experiments on many in-the-wild datasets validate the robustness of the proposed method under extreme poses, exaggerated expressions and heavy occlusions. Finally, we show that accurate 3D face alignment can assist pose-invariant face recognition, where we achieve a new state-of-the-art accuracy on CFP-FP.

Code Repositories

Repository	Framework	Note
deepinsight/insightface	pytorch	Official
deepinx/SDU_face_alignment	mxnet	Mentioned in GitHub
deepinx/sdu-face-alignment	mxnet	Mentioned in GitHub

Benchmarks

Benchmark	Methodology	Metrics
face-alignment-on-cofw	DenseU-Net + Dual Transformer	NME (inter-pupil): 5.55%
face-alignment-on-ibug	DenseU-Net + Dual Transformer	Mean Error Rate: 6.73%
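
The COFW row above reports NME normalised by the inter-pupil distance. As a rough illustration of how such a number is typically computed, the sketch below averages the point-to-point error over all landmarks and divides by the distance between the two pupil centres. The function name `nme_inter_pupil` and its arguments are hypothetical and are not taken from the paper or its repositories; how the pupil centres are derived from a dataset's landmark set is an assumption left to the caller.

```python
import math


def nme_inter_pupil(pred, gt, left_pupil, right_pupil):
    """Normalised mean error for one face (hypothetical helper).

    pred, gt: equal-length sequences of (x, y) landmark coordinates.
    left_pupil, right_pupil: ground-truth (x, y) pupil centres; how these
    are obtained from a given landmark markup is assumed, not specified here.
    """
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and the same length")

    # Normalising factor: Euclidean distance between the two pupil centres.
    norm = math.dist(left_pupil, right_pupil)
    if norm == 0:
        raise ValueError("inter-pupil distance must be non-zero")

    # Mean Euclidean distance between predicted and ground-truth landmarks.
    mean_err = sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)
    return mean_err / norm


# Toy example: one landmark is off by 5 px, the other is exact,
# and the pupils are 10 px apart.
gt = [(0.0, 0.0), (10.0, 0.0)]
pred = [(3.0, 4.0), (10.0, 0.0)]
print(nme_inter_pupil(pred, gt, gt[0], gt[1]))  # 0.25
```

A dataset-level figure such as the 5.55% in the table would then be the average of this per-face value over all test images, expressed as a percentage.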
