CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong

Abstract
Speech-driven 3D facial animation has been widely studied, yet there is still a gap to achieving realism and vividness due to the highly ill-posed nature and scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping into a regression task, which suffers from the regression-to-mean problem leading to over-smoothed facial motions. In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty. The codebook is learned by self-reconstruction over real facial motions and thus embedded with realistic facial motion priors. Over the discrete motion space, a temporal autoregressive model is employed to sequentially synthesize facial motions from the input speech signal, which guarantees lip-sync as well as plausible facial expressions. We demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. Also, a user study further justifies our superiority in perceptual quality.
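The discrete motion prior rests on vector quantization: each continuous motion feature is replaced by its nearest codebook entry, so the speech-to-motion mapping only has to choose among a finite set of realistic motions rather than regress in a continuous space. A minimal sketch of that quantization step (the dimensions and the random codebook are illustrative assumptions, not the paper's learned model):

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry
    (Euclidean distance); returns discrete indices and the
    quantized vectors. Standard VQ lookup, not the trained
    motion prior itself."""
    # distances: (T, K) between T feature frames and K codebook entries
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)        # one discrete code per frame
    return idx, codebook[idx]     # quantized motion features

rng = np.random.default_rng(0)
codebook = rng.normal(size=(64, 8))   # K=64 codes of dim 8 (placeholder)
feats = rng.normal(size=(10, 8))      # 10 "frames" of motion features
idx, quantized = quantize(feats, codebook)
```

In the full method, the codebook is learned by self-reconstruction over real facial motions, and an autoregressive model predicts the index sequence `idx` from speech instead of computing it from ground-truth features.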
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-face-animation-on-beat2 | CodeTalker | MSE: 8.026 |
| 3d-face-animation-on-biwi-3d-audiovisual | CodeTalker | FDD: 4.1170 Lip Vertex Error: 4.7914 |