5 months ago

MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement

Richard Alexander ; Zollhoefer Michael ; Wen Yandong ; de la Torre Fernando ; Sheikh Yaser

Abstract

This paper presents a generic method for generating full facial 3D animation from speech. Existing approaches to audio-driven facial animation exhibit uncanny or static upper face animation, fail to produce accurate and plausible co-articulation or rely on person-specific models that limit their scalability. To improve upon existing models, we propose a generic audio-driven facial animation approach that achieves highly realistic motion synthesis results for the entire face. At the core of our approach is a categorical latent space for facial animation that disentangles audio-correlated and audio-uncorrelated information based on a novel cross-modality loss. Our approach ensures highly accurate lip motion, while also synthesizing plausible animation of the parts of the face that are uncorrelated to the audio signal, such as eye blinks and eye brow motion. We demonstrate that our approach outperforms several baselines and obtains state-of-the-art quality both qualitatively and quantitatively. A perceptual user study demonstrates that our approach is deemed more realistic than the current state-of-the-art in over 75% of cases. We recommend watching the supplemental video before reading the paper: https://github.com/facebookresearch/meshtalk

Code Repositories

facebookresearch/multiface
pytorch
Mentioned in GitHub

Benchmarks

Benchmark | Methodology | Metrics
3d-face-animation-on-vocaset | MeshTalk | Lip Vertex Error: 6.7436
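
For readers unfamiliar with the Lip Vertex Error metric listed above, the following is a minimal sketch of one commonly used definition: for each frame, take the largest Euclidean distance between predicted and ground-truth positions over the lip vertices, then average over frames. This listing does not state the exact definition, scaling, or units behind the reported value, so treat the formula as an assumption. The function name `lip_vertex_error` and the `lip_indices` argument are hypothetical and are not taken from the MeshTalk code.

```python
import math


def lip_vertex_error(pred, target, lip_indices):
    """Mean over frames of the maximum per-vertex L2 error in the lip region.

    pred, target: sequences of frames; each frame is a sequence of (x, y, z)
                  vertex positions with identical vertex ordering.
    lip_indices:  indices of the vertices considered part of the lips
                  (an assumed input; the real lip mask is dataset-specific).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of frames")
    if not pred or not lip_indices:
        raise ValueError("need at least one frame and one lip vertex")

    per_frame_max = []
    for pred_frame, target_frame in zip(pred, target):
        # Largest Euclidean distance among the lip vertices in this frame.
        worst = max(
            math.dist(pred_frame[i], target_frame[i]) for i in lip_indices
        )
        per_frame_max.append(worst)

    # Average the per-frame maxima over the whole sequence.
    return sum(per_frame_max) / len(per_frame_max)


# Toy example: 2 frames, 3 vertices, vertices 1 and 2 treated as lips.
target = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]
pred = [
    [(9.0, 9.0, 9.0), (4.0, 4.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 2.0)],
]
example_error = lip_vertex_error(pred, target, lip_indices=[1, 2])
```

In the toy example, vertex 0 has a large error but is ignored because it is outside the assumed lip region. Some papers report the squared distance instead, so values from different sources are only comparable when the definition matches.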
