5 months ago

Guided Attention for Interpretable Motion Captioning

Radouane Karim ; Lagarde Julien ; Ranwez Sylvie ; Tchechmedjiev Andon

Abstract

Diverse and extensive work has recently been conducted on text-conditioned human motion generation. However, progress in the reverse direction, motion captioning, has seen less comparable advancement. In this paper, we introduce a novel architecture design that enhances text generation quality by emphasizing interpretability through spatio-temporal and adaptive attention mechanisms. To encourage human-like reasoning, we propose methods for guiding attention during training, emphasizing relevant skeleton areas over time and distinguishing motion-related words. We discuss and quantify our model's interpretability using relevant histograms and density distributions. Furthermore, we leverage interpretability to derive fine-grained information about human motion, including action localization, body part identification, and the distinction of motion-related words. Finally, we discuss the transferability of our approaches to other tasks. Our experiments demonstrate that attention guidance leads to interpretable captioning while enhancing performance compared to higher parameter-count, non-interpretable state-of-the-art systems. The code is available at: https://github.com/rd20karim/M2T-Interpretable.
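
The idea of guiding attention during training can be illustrated with a small sketch. This is not the paper's formulation: it assumes one generic approach, in which an auxiliary cross-entropy term penalizes attention weights that diverge from a target distribution over skeleton regions. The body-part grouping, the scores, and the function names below are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical grouping of skeleton joints into coarse body parts (assumption).
BODY_PARTS = ["torso", "left_arm", "right_arm", "left_leg", "right_leg"]


def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def guidance_loss(attention, target, eps=1e-12):
    """Cross-entropy between a target distribution and attention weights.

    The loss is small when attention mass sits on the parts the target
    marks as relevant, and large when it is spread elsewhere.
    """
    return -sum(t * math.log(a + eps) for a, t in zip(attention, target))


# Illustrative scores for one frame; the model mostly attends to the left arm.
scores = [0.1, 2.0, 0.3, 0.2, 0.1]
attention = softmax(scores)

# Assumed supervision: the caption concerns the left arm only.
target = [0.0, 1.0, 0.0, 0.0, 0.0]

loss = guidance_loss(attention, target)
uniform_loss = guidance_loss(softmax([0.0] * len(BODY_PARTS)), target)
```

In this sketch, `loss` is lower than `uniform_loss` because the attention already favors the targeted body part. In a training setup, a term of this kind would typically be added to the captioning loss with a weighting factor.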

Code Repositories

rd20karim/m2t-interpretable (Official, PyTorch, mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
motion-captioning-on-humanml3d | ST-MLP | BERTScore: 40.3, BLEU-4: 25.0
motion-captioning-on-kit-motion-language | ST-MLP | BERTScore: 41.2, BLEU-4: 24.4
