Learning Individual Styles of Conversational Gesture

Shiry Ginosar; Amir Bar; Gefen Kohavi; Caroline Chan; Andrew Owens; Jitendra Malik

Abstract

Human speech is often accompanied by hand and arm gestures. Given audio speech input, we generate plausible gestures to go along with the sound. Specifically, we perform cross-modal translation from "in-the-wild" monologue speech of a single speaker to their hand and arm motion. We train on unlabeled videos for which we only have noisy pseudo ground truth from an automatic pose detection system. Our proposed model significantly outperforms baseline methods in a quantitative comparison. To support research toward obtaining a computational understanding of the relationship between gesture and speech, we release a large video dataset of person-specific gestures. The project website with video, code and data can be found at http://people.eecs.berkeley.edu/~shiry/speech2gesture.
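The abstract describes the pipeline only at a high level: a speech clip goes in, a sequence of 2D arm and hand keypoints comes out, and training fits the model to noisy keypoints from an off-the-shelf pose detector. The PyTorch sketch below illustrates that setup under stated assumptions; the layer sizes, the 49-keypoint pose layout, the spectrogram input shape, and the plain L1 regression loss are illustrative choices, not the authors' released architecture.

```python
# Minimal sketch of an audio-to-pose translation setup like the one the
# abstract describes; NOT the authors' released code. Layer sizes, the
# 49-keypoint layout, and the loss are illustrative assumptions.
import torch
import torch.nn as nn

class Speech2GestureSketch(nn.Module):
    """Maps a log-mel spectrogram to a sequence of 2D arm/hand keypoints."""
    def __init__(self, n_mels=64, n_keypoints=49, hidden=256):
        super().__init__()
        # 1D convolutions over time: treat mel bins as input channels.
        self.encoder = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=5, padding=2),
            nn.BatchNorm1d(hidden),
            nn.LeakyReLU(0.2),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2),
            nn.BatchNorm1d(hidden),
            nn.LeakyReLU(0.2),
        )
        # Regress (x, y) for every keypoint at every time step.
        self.head = nn.Conv1d(hidden, n_keypoints * 2, kernel_size=1)

    def forward(self, spectrogram):
        # spectrogram: (batch, n_mels, time) -> poses: (batch, time, 49, 2)
        h = self.encoder(spectrogram)
        out = self.head(h)
        b, _, t = out.shape
        return out.permute(0, 2, 1).reshape(b, t, -1, 2)

# Training pairs each clip's audio with pseudo ground-truth keypoints from
# an automatic pose detector; an L1 regression fit is one plausible loss.
model = Speech2GestureSketch()
spec = torch.randn(2, 64, 128)          # batch of 2 clips, 128 time steps
pseudo_gt = torch.randn(2, 128, 49, 2)  # detector keypoints (noisy labels)
loss = nn.functional.l1_loss(model(spec), pseudo_gt)
loss.backward()
```

Because the detector output is noisy rather than true ground truth, a pure regression loss tends to average over plausible motions; the paper reports adding an adversarial component to sharpen the predictions, which this sketch omits.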

Code Repositories

PantoMatrix/BEAT (PyTorch), mentioned on GitHub

Benchmarks

Benchmark                   | Methodology     | Metrics
gesture-generation-on-beat  | Speech2Gestures | FID: 256.7
gesture-generation-on-beat2 | S2G             | FGD: 2.815
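For context on the metrics in the table: FID and FGD (Fréchet Gesture Distance) are both Fréchet distances between Gaussians fit to feature embeddings of real versus generated samples, with lower values indicating closer distributions. Below is a minimal NumPy/SciPy sketch of that distance; the feature extractor that would produce the gesture embeddings is assumed and not shown here.

```python
# Frechet distance between Gaussians fit to two embedding sets, the
# computation underlying both FID and FGD. A sketch, not a benchmark-
# official implementation; the embedding model is assumed upstream.
import numpy as np
from scipy import linalg

def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussian fits of two (N, D) feature sets."""
    mu1, mu2 = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    sigma1 = np.cov(feats_real, rowvar=False)
    sigma2 = np.cov(feats_gen, rowvar=False)
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):   # numerical noise can introduce tiny
        covmean = covmean.real     # imaginary parts; drop them
    diff = mu1 - mu2
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)

# Example with random stand-in embeddings (real use: encoder features of
# ground-truth vs. generated gesture sequences).
real = np.random.randn(500, 32)
gen = np.random.randn(500, 32) + 0.1
print(frechet_distance(real, gen))
```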
