
AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors

Kaishen Yuan, Zitong Yu, Xin Liu, Weicheng Xie, Huanjing Yue, Jingyu Yang

Abstract

Facial Action Units (AUs) are a vital concept in the realm of affective computing, and AU detection has long been a hot research topic. Existing methods suffer from overfitting due to the use of a large number of learnable parameters on scarce AU-annotated datasets, or rely heavily on substantial additional relevant data. Parameter-Efficient Transfer Learning (PETL) provides a promising paradigm to address these challenges, yet its existing methods lack designs tailored to AU characteristics. We therefore investigate the PETL paradigm for AU detection, introducing AUFormer and proposing a novel Mixture-of-Knowledge Expert (MoKE) collaboration mechanism. An individual MoKE specific to a certain AU, with minimal learnable parameters, first integrates personalized multi-scale and correlation knowledge. The MoKE then collaborates with the other MoKEs in its expert group to obtain aggregated information and injects it into the frozen Vision Transformer (ViT) to achieve parameter-efficient AU detection. Additionally, we design a Margin-truncated Difficulty-aware Weighted Asymmetric Loss (MDWA-Loss), which encourages the model to focus more on activated AUs, differentiates the difficulty of unactivated AUs, and discards potentially mislabeled samples. Extensive experiments from various perspectives, including within-domain, cross-domain, data-efficiency, and micro-expression domain evaluations, demonstrate AUFormer's state-of-the-art performance and robust generalization without relying on additional relevant data. The code for AUFormer is available at https://github.com/yuankaishen2001/AUFormer.
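
The MDWA-Loss described above reweights activated AUs, applies difficulty-aware focusing to unactivated AUs, and truncates likely mislabeled negatives. The PyTorch sketch below only illustrates that general idea; the functional form, the per-AU weights, and the hyperparameters (focusing exponent, margin, truncation threshold) are assumptions for illustration and are not the exact formulation used in AUFormer (see the linked repository for the authors' implementation).

```python
import torch
import torch.nn as nn

class MDWALossSketch(nn.Module):
    """Illustrative sketch of a margin-truncated, difficulty-aware weighted
    asymmetric loss for multi-label AU detection. All details below are
    assumptions, not the paper's exact formulation."""

    def __init__(self, au_weights, gamma_neg=2.0, margin=0.05, trunc_prob=0.9):
        super().__init__()
        # au_weights: assumed per-AU weights (e.g. inverse occurrence frequency), shape (num_aus,)
        self.register_buffer("au_weights", torch.as_tensor(au_weights, dtype=torch.float32))
        self.gamma_neg = gamma_neg    # focusing exponent that differentiates unactivated-AU difficulty
        self.margin = margin          # probability margin subtracted from negative predictions
        self.trunc_prob = trunc_prob  # negatives predicted above this are treated as possible label noise

    def forward(self, logits, targets):
        # logits, targets: (batch, num_aus), targets in {0, 1}
        p = torch.sigmoid(logits)
        eps = 1e-8

        # Activated AUs: weighted log-likelihood so the model focuses on positives.
        loss_pos = targets * torch.log(p.clamp(min=eps))

        # Unactivated AUs: shift the probability by a margin and down-weight easy
        # negatives via the focusing exponent, so harder negatives dominate the gradient.
        p_m = (p - self.margin).clamp(min=0.0)
        loss_neg = (1 - targets) * (p_m ** self.gamma_neg) * torch.log((1 - p_m).clamp(min=eps))

        # Truncate negatives the model is extremely confident about (potentially mislabeled).
        keep = ~((targets == 0) & (p > self.trunc_prob))
        loss = -(loss_pos + loss_neg) * keep.float() * self.au_weights

        return loss.mean()
```

In this sketch the per-AU weights counteract label imbalance, while the margin and truncation threshold together make the negative branch ignore both trivially easy negatives and suspiciously confident ones.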

