3D human pose estimation in video with temporal convolutions and semi-supervised training

Dario Pavllo; Christoph Feichtenhofer; David Grangier; Michael Auli

Abstract

In this work, we demonstrate that 3D poses in video can be effectively estimated with a fully convolutional model based on dilated temporal convolutions over 2D keypoints. We also introduce back-projection, a simple and effective semi-supervised training method that leverages unlabeled video data. We start with predicted 2D keypoints for unlabeled video, then estimate 3D poses, and finally back-project to the input 2D keypoints. In the supervised setting, our fully convolutional model outperforms the previous best result from the literature by 6 mm mean per-joint position error on Human3.6M, corresponding to an error reduction of 11%, and the model also shows significant improvements on HumanEva-I. Moreover, experiments with back-projection show that it comfortably outperforms previous state-of-the-art results in semi-supervised settings where labeled data is scarce. Code and models are available at https://github.com/facebookresearch/VideoPose3D.
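
To make "dilated temporal convolutions" concrete, the sketch below computes how many input frames a single output pose depends on when convolutions are stacked. It is an illustration under stated assumptions, not the authors' code: the function name `receptive_field` is hypothetical, and the rule that each layer's dilation equals the product of the preceding filter widths is an assumption about a typical configuration rather than something stated in the abstract.

```python
def receptive_field(filter_widths):
    """Number of input frames that influence one output frame.

    Assumption (not taken from the abstract): layer i uses a dilation equal
    to the product of the filter widths of all earlier layers, so the
    receptive field grows exponentially with depth.
    """
    field = 1      # a single frame before any convolution is applied
    dilation = 1   # the first layer is not dilated
    for width in filter_widths:
        # A filter of this width with this dilation spans
        # (width - 1) * dilation additional frames.
        field += (width - 1) * dilation
        dilation *= width
    return field


if __name__ == "__main__":
    for depth in range(6):
        print(depth, receptive_field([3] * depth))
```

Under these assumptions, five width-3 layers cover 243 frames and an empty stack covers a single frame.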

Code Repositories

philipNoonan/OPVP3D (pytorch, mentioned in GitHub)
facebookresearch/VideoPose3D (official, pytorch, mentioned in GitHub)
garyzhao/SemGCN (pytorch, mentioned in GitHub)
zhimingzo/modulated-gcn (pytorch, mentioned in GitHub)
sjtuxcx/ITES (pytorch, mentioned in GitHub)
happyvictor008/High-order-GNN-LF-iter (pytorch, mentioned in GitHub)
raymondyeh07/chirality_nets (pytorch, mentioned in GitHub)
ailingzengzzz/Split-and-Recombine-Net (pytorch, mentioned in GitHub)
vnmr/JointVideoPose3D (pytorch, mentioned in GitHub)

Benchmarks

Benchmark: 3d-human-pose-estimation-on-human36m
Methodology: VideoPose3D (T=243)
Metrics:
Average MPJPE (mm): 46.8
Multi-View or Monocular: Monocular
PA-MPJPE: 36.5
Using 2D ground-truth joints: No

Benchmark: 3d-human-pose-estimation-on-human36m
Methodology: VideoPose3D (T=1)
Metrics:
Average MPJPE (mm): 51.8
Multi-View or Monocular: Monocular
PA-MPJPE: 40
Using 2D ground-truth joints: No

Benchmark: monocular-3d-human-pose-estimation-on-human3
Methodology: VideoPose3D (T=243)
Metrics:
2D detector: CPN
Average MPJPE (mm): 46.8
Frames Needed: 243
Need Ground Truth 2D Pose: No
Use Video Sequence: Yes

Benchmark: weakly-supervised-3d-human-pose-estimation-on
Methodology: VideoPose3D (T=243)
Metrics:
Number of Frames Per View: 243

Benchmark: weakly-supervised-3d-human-pose-estimation-on
Methodology: Pavllo et al.
Metrics:
3D Annotations: S1
Average MPJPE (mm): 64.7
Number of Views: 1
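
The Average MPJPE figures above are mean per-joint position errors in millimetres. As a rough illustration of the metric, the following sketch averages the Euclidean distance between corresponding predicted and reference joints. The function name `mpjpe` and the toy coordinates are hypothetical, and the sketch assumes both poses are already expressed in the same coordinate frame (for example, relative to a common root joint). It does not perform the rigid alignment that PA-MPJPE applies before measuring the error.

```python
import math


def mpjpe(predicted, target):
    """Mean per-joint position error for one pose.

    Both arguments are sequences of (x, y, z) joint coordinates given in the
    same units and the same coordinate frame (an assumption of this sketch).
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("poses must be non-empty and have the same number of joints")
    # Euclidean distance per joint, averaged over all joints.
    return sum(math.dist(p, t) for p, t in zip(predicted, target)) / len(predicted)


if __name__ == "__main__":
    # Toy two-joint pose in millimetres: the first joint is exact,
    # the second is off by a 3-4-5 triangle.
    prediction = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    reference = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(mpjpe(prediction, reference))
```

A benchmark score such as the averages listed above would then be the mean of this per-pose value over all evaluated frames.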
