Generating Holistic 3D Human Motion from Speech

Yi Hongwei ; Liang Hualin ; Liu Yifei ; Cao Qiong ; Wen Yandong ; Bolkart Timo ; Tao Dacheng ; Black Michael J.

Abstract

This work addresses the problem of generating 3D holistic body motions from human speech. Given a speech recording, we synthesize sequences of 3D body poses, hand gestures, and facial expressions that are realistic and diverse. To achieve this, we first build a high-quality dataset of 3D holistic body meshes with synchronous speech. We then define a novel speech-to-motion generation framework in which the face, body, and hands are modeled separately. The separated modeling stems from the fact that face articulation strongly correlates with human speech, while body poses and hand gestures are less correlated. Specifically, we employ an autoencoder for face motions, and a compositional vector-quantized variational autoencoder (VQ-VAE) for the body and hand motions. The compositional VQ-VAE is key to generating diverse results. Additionally, we propose a cross-conditional autoregressive model that generates body poses and hand gestures, leading to coherent and realistic motions. Extensive experiments and user studies demonstrate that our proposed approach achieves state-of-the-art performance both qualitatively and quantitatively. Our novel dataset and code will be released for research purposes at https://talkshow.is.tue.mpg.de.
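
To make the "compositional" idea in the abstract concrete, the following is a minimal sketch of vector quantization with two separate codebooks, one for body latents and one for hand latents. It is not the authors' implementation: the codebook sizes, latent dimension, random stand-in latents, and all function names are assumptions chosen for illustration, and a real VQ-VAE would learn both the encoder and the codebooks.

```python
import random


def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def quantize(latents, codebook):
    """Map each latent vector to its nearest code; return indices and codes."""
    indices = [nearest_code(z, codebook) for z in latents]
    return indices, [codebook[i] for i in indices]


def random_vectors(count, dim, rng):
    """Hypothetical stand-in for learned codebooks or encoder outputs."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)

# Assumed sizes, for illustration only.
BODY_CODES, HAND_CODES, DIM, FRAMES = 8, 8, 4, 5

body_codebook = random_vectors(BODY_CODES, DIM, rng)
hand_codebook = random_vectors(HAND_CODES, DIM, rng)

# Stand-in per-frame latents; in practice these come from trained encoders.
body_latents = random_vectors(FRAMES, DIM, rng)
hand_latents = random_vectors(FRAMES, DIM, rng)

# Each part is quantized against its own codebook.
body_idx, body_q = quantize(body_latents, body_codebook)
hand_idx, hand_q = quantize(hand_latents, hand_codebook)

# A frame is described by a pair of indices, so the number of distinct
# body/hand combinations is the product of the two codebook sizes.
composed = list(zip(body_idx, hand_idx))
num_combinations = BODY_CODES * HAND_CODES

print(composed)
print(num_combinations)
```

Keeping the two codebooks separate means a frame can pair any body code with any hand code, which is one plausible reading of why the abstract credits the compositional design with more diverse results.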

Code Repositories

yhw-yhw/show (pytorch; mentioned in GitHub)
zhenglinzhou/headstudio (jax; mentioned in GitHub)
yhw-yhw/talkshow (official; pytorch; mentioned in GitHub)

Benchmarks

Benchmark                      Methodology   Metrics
3d-face-animation-on-beat2     TalkShow      MSE: 7.791
gesture-generation-on-beat2    TalkShow      FGD: 0.6209
