5 months ago

CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior

Xing Jinbo; Xia Menghan; Zhang Yuechen; Cun Xiaodong; Wang Jue; Wong Tien-Tsin

Abstract

Speech-driven 3D facial animation has been widely studied, yet there is still a gap to achieving realism and vividness due to the highly ill-posed nature and scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping into a regression task, which suffers from the regression-to-mean problem leading to over-smoothed facial motions. In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty. The codebook is learned by self-reconstruction over real facial motions and thus embedded with realistic facial motion priors. Over the discrete motion space, a temporal autoregressive model is employed to sequentially synthesize facial motions from the input speech signal, which guarantees lip-sync as well as plausible facial expressions. We demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. Also, a user study further justifies our superiority in perceptual quality.
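
The "code query" idea in the abstract can be illustrated with a generic nearest-neighbour lookup into a fixed codebook, as used in vector quantization. The sketch below is not the authors' implementation: the function names, the 2-D toy vectors, and the use of squared Euclidean distance are all assumptions made for illustration. In the paper, the codebook is learned from real facial motions and the query is predicted from speech by an autoregressive model, which is not shown here.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def query_codebook(feature: Vector, codebook: List[Vector]) -> int:
    """Return the index of the codebook entry closest to `feature`."""
    return min(
        range(len(codebook)),
        key=lambda i: squared_distance(feature, codebook[i]),
    )


def quantize_sequence(
    features: List[Vector], codebook: List[Vector]
) -> Tuple[List[int], List[List[float]]]:
    """Map each per-frame feature to its nearest code (hypothetical helper)."""
    indices = [query_codebook(f, codebook) for f in features]
    quantized = [list(codebook[i]) for i in indices]
    return indices, quantized


# Toy codebook with three 2-D entries (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Three continuous per-frame features to be snapped onto the codebook.
features = [[0.9, 0.1], [0.1, 0.8], [0.2, 0.1]]

indices, quantized = quantize_sequence(features, codebook)
print(indices)  # [1, 2, 0]
```

Because every output is one of a finite set of learned entries, a predictor that selects codes cannot average several plausible motions into a blurred one, which is the intuition behind the abstract's claim about avoiding regression-to-mean.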

Code Repositories

Doubiiu/CodeTalker (official, PyTorch)

Benchmarks

Benchmark                                 Methodology   Metrics
3d-face-animation-on-beat2                CodeTalker    MSE: 8.026
3d-face-animation-on-biwi-3d-audiovisual  CodeTalker    FDD: 4.1170; Lip Vertex Error: 4.7914
