TesseTrack: End-to-End Learnable Multi-Person Articulated 3D Pose Tracking

Srinivasa Narasimhan, Jayan Eledath, Leonid Pishchulin, Laurent Guigues, N. Dinesh Reddy

Abstract

We consider the task of 3D pose estimation and tracking of multiple people seen in an arbitrary number of camera feeds. We propose TesseTrack, a novel top-down approach that simultaneously reasons about multiple individuals’ 3D body joint reconstructions and associations in space and time in a single end-to-end learnable framework. At the core of our approach is a novel spatio-temporal formulation that operates in a common voxelized feature space aggregated from single or multiple camera views. After a person detection step, a 4D CNN produces short-term person-specific representations, which are then linked across time by a differentiable matcher. The linked descriptions are then merged and deconvolved into 3D poses. This joint spatio-temporal formulation contrasts with previous piecewise strategies that treat 2D pose estimation, 2D-to-3D lifting, and 3D pose tracking as independent sub-problems that are error-prone when solved in isolation. Furthermore, unlike previous methods, TesseTrack is robust to changes in the number of camera views and achieves very good results even if only a single view is available at inference time. Quantitative evaluation of 3D pose reconstruction accuracy on standard benchmarks shows significant improvements over the state of the art. Evaluation of multi-person articulated 3D pose tracking in our novel evaluation framework demonstrates the superiority of TesseTrack over strong baselines.
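The abstract does not describe how the differentiable matcher is built. As an illustration only, one common way to make assignment differentiable is to apply Sinkhorn normalization to a similarity matrix between descriptors from two time steps. The NumPy sketch below shows that general idea; the function name `soft_assignment`, the cosine similarity, the temperature, and the iteration count are all assumptions and are not taken from the paper.

```python
import numpy as np


def _logsumexp(x, axis):
    # Numerically stable log(sum(exp(x))) along one axis, keeping dimensions.
    m = np.max(x, axis=axis, keepdims=True)
    return m + np.log(np.sum(np.exp(x - m), axis=axis, keepdims=True))


def soft_assignment(desc_a, desc_b, temperature=0.1, n_iters=50):
    """Soft matching between two sets of person descriptors (hypothetical helper).

    desc_a: (N, D) descriptors at time t.
    desc_b: (N, D) descriptors at time t + 1 (same count assumed for simplicity).
    Returns an (N, N) matrix whose rows and columns each sum to roughly 1.
    """
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    # Cosine similarity scaled by a temperature acts as the matching logits.
    log_p = (a @ b.T) / temperature
    # Sinkhorn iterations in log space: alternately normalize rows and columns.
    for _ in range(n_iters):
        log_p = log_p - _logsumexp(log_p, axis=1)
        log_p = log_p - _logsumexp(log_p, axis=0)
    return np.exp(log_p)


# Toy usage: the second set is a permutation of the first.
descriptors_t = np.eye(3)
descriptors_t1 = descriptors_t[[2, 0, 1]]
matches = soft_assignment(descriptors_t, descriptors_t1)
```

Every step here is a smooth function of the descriptors, so gradients could flow through the matching if the same operations were written in an autodiff framework. Handling people who enter or leave the scene would need extra machinery that this sketch omits.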

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| 3d-human-pose-estimation-on-cmu-panoptic | TesseTrack Multi-View (5 views) | Average MPJPE (mm): 7.3 |
| 3d-human-pose-estimation-on-cmu-panoptic | TesseTrack Monocular | Average MPJPE (mm): 18.9 |
| 3d-human-pose-estimation-on-human36m | TesseTrack (Monocular) | Average MPJPE (mm): 44.6; Multi-View or Monocular: Monocular; Using 2D ground-truth joints: No |
| 3d-human-pose-estimation-on-human36m | TesseTrack (Multi-View) | Average MPJPE (mm): 18.7; Multi-View or Monocular: Multi-View; Using 2D ground-truth joints: No |
| 3d-human-pose-tracking-on-cmu-panoptic | TesseTrack | 3DMOTA: 94.1 |
| 3d-multi-person-pose-estimation-on-campus | TesseTrack | PCP3D: 97.4 |
| 3d-multi-person-pose-estimation-on-cmu | TesseTrack | Average MPJPE (mm): 7.3 |
| 3d-multi-person-pose-estimation-on-shelf | TesseTrack (paper) | PCP3D: 98.2 |
| 3d-multi-person-pose-estimation-on-shelf | TesseTrack (correct) | PCP3D: 97.9 |
| 3d-pose-estimation-on-human3-6m | TesseTrack | Average MPJPE (mm): 18.7 |
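
Several of the results above are reported as MPJPE (mean per-joint position error), the average Euclidean distance between predicted and ground-truth 3D joints, in millimetres. A minimal sketch of the plain definition follows; the function name `mpjpe` and the array layout are assumptions, and individual benchmark protocols may apply alignment or joint selection steps that are not shown here.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error.

    pred, gt: arrays of shape (..., J, 3) holding 3D joint positions in mm.
    Returns the mean Euclidean distance over all joints (and leading axes).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Per-joint Euclidean distance, then average over everything.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


# Toy usage: two joints, one off by a 3-4-5 triangle, one exact.
ground_truth = np.zeros((2, 3))
prediction = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 0.0]])
error_mm = mpjpe(prediction, ground_truth)  # (5 + 0) / 2 = 2.5
```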
