GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling
Pinxin Liu, Luchuan Song, Junhua Huang, Haiyang Liu, Chenliang Xu

Abstract
Generating full-body human gestures from speech signals remains challenging in terms of both quality and speed. Existing approaches model different body regions such as the body, legs, and hands separately, which fails to capture the spatial interactions between them and results in unnatural and disjointed movements. Additionally, their autoregressive or diffusion-based pipelines generate slowly because they require dozens of inference steps. To address these two challenges, we propose GestureLSM, a flow-matching-based approach for co-speech gesture generation with spatial-temporal modeling. Our method i) explicitly models the interaction of tokenized body regions through spatial and temporal attention, generating coherent full-body gestures, and ii) introduces flow matching to enable more efficient sampling by explicitly modeling the latent velocity space. To overcome the suboptimal performance of the flow matching baseline, we propose latent shortcut learning and beta-distribution timestep sampling during training, which enhance gesture synthesis quality and accelerate inference. Combining spatial-temporal modeling with the improved flow-matching framework, GestureLSM achieves state-of-the-art performance on BEAT2 while significantly reducing inference time compared to existing methods, highlighting its potential for enhancing digital humans and embodied agents in real-world applications. Project page: https://andypinxinliu.github.io/GestureLSM
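To make the training recipe concrete, below is a minimal sketch of a flow-matching training step that draws timesteps from a Beta distribution instead of Uniform(0, 1), as the abstract describes. The model interface, latent dimensionality, and the Beta parameters (`alpha`, `beta`) are illustrative assumptions, not the authors' implementation; the shortcut-learning and spatial-temporal attention components are omitted for brevity.

```python
# Hedged sketch: flow-matching training with Beta-distributed timesteps.
# All hyperparameters and the network are placeholders, not GestureLSM's code.
import torch
import torch.nn as nn


class VelocityField(nn.Module):
    """Toy stand-in for a latent velocity model (assumed interface)."""

    def __init__(self, latent_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 1, 512), nn.SiLU(),
            nn.Linear(512, latent_dim),
        )

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Condition on the timestep by simple concatenation (illustrative).
        return self.net(torch.cat([x_t, t[:, None]], dim=-1))


def flow_matching_step(model, x1, alpha=2.0, beta=1.0):
    """One training step: regress the straight-line velocity x1 - x0.

    t ~ Beta(alpha, beta) rather than Uniform(0, 1); the concrete
    (alpha, beta) values here are assumptions for illustration.
    """
    x0 = torch.randn_like(x1)                      # noise endpoint
    t = torch.distributions.Beta(alpha, beta).sample((x1.shape[0],))
    x_t = (1 - t[:, None]) * x0 + t[:, None] * x1  # linear interpolant
    target_v = x1 - x0                             # ground-truth velocity
    pred_v = model(x_t, t)
    return ((pred_v - target_v) ** 2).mean()


# Usage: one optimization step on a dummy batch of gesture latents.
model = VelocityField()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
opt.zero_grad()
loss = flow_matching_step(model, torch.randn(32, 256))
loss.backward()
opt.step()
```

Skewing the timestep distribution concentrates training signal on the interpolation regime where the velocity field is hardest to learn; the uniform-sampling baseline is recovered with `alpha = beta = 1.0`.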
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| gesture-generation-on-beat2 | GestureLSM | FGD (Fréchet Gesture Distance): 0.4040 |