Zhu Yue; Samet Nermin; Picard David

Abstract
We present a benchmark for 3D human whole-body pose estimation, which involves identifying accurate 3D keypoints on the entire human body, including face, hands, body, and feet. Currently, the lack of a fully annotated and accurate 3D whole-body dataset results in deep networks either being trained separately on specific body parts, which are combined during inference, or relying on pseudo-groundtruth provided by parametric body models, which are not as accurate as detection-based methods. To overcome these issues, we introduce the Human3.6M 3D WholeBody (H3WB) dataset, which provides whole-body annotations for the Human3.6M dataset using the COCO Wholebody layout. H3WB comprises 133 whole-body keypoint annotations on 100K images, made possible by our new multi-view pipeline. We also propose three tasks: i) 3D whole-body pose lifting from 2D complete whole-body pose, ii) 3D whole-body pose lifting from 2D incomplete whole-body pose, and iii) 3D whole-body pose estimation from a single RGB image. Additionally, we report several baselines from popular methods for these tasks. Furthermore, we provide automated 3D whole-body annotations of TotalCapture and experimentally show that using them together with H3WB helps to improve performance. Code and dataset are available at https://github.com/wholebody3d/wholebody3d
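To make the 133-keypoint layout concrete, the following minimal sketch splits a whole-body pose into its parts. It assumes the standard COCO-WholeBody keypoint ordering (17 body, 6 feet, 68 face, 21 left-hand, 21 right-hand keypoints); the index ranges and the helper name `split_wholebody` are illustrative assumptions, not taken from the H3WB code, so check them against the dataset files before relying on them.

```python
import numpy as np

# Assumed COCO-WholeBody ordering; ranges are (start inclusive, end exclusive).
WHOLEBODY_PARTS = {
    "body": (0, 17),
    "feet": (17, 23),
    "face": (23, 91),
    "left_hand": (91, 112),
    "right_hand": (112, 133),
}


def split_wholebody(pose):
    """Split a (133, D) keypoint array into per-part arrays (hypothetical helper)."""
    pose = np.asarray(pose)
    if pose.shape[0] != 133:
        raise ValueError(f"expected 133 keypoints, got {pose.shape[0]}")
    return {name: pose[start:end] for name, (start, end) in WHOLEBODY_PARTS.items()}


# Usage: a dummy 3D pose with 133 keypoints.
parts = split_wholebody(np.zeros((133, 3)))
print({name: arr.shape for name, arr in parts.items()})
```

Splitting by part in this way is what allows separate error figures for face, hands, and the whole body to be reported from a single prediction.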
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-facial-landmark-localization-on-h3wb | SimpleBaseline | Average MPJPE (mm): 34.0 |
| 3d-facial-landmark-localization-on-h3wb | CanonPose | Average MPJPE (mm): 31.9 |
| 3d-facial-landmark-localization-on-h3wb | SHN + SimpleBaseline | Average MPJPE (mm): 32.5 |
| 3d-facial-landmark-localization-on-h3wb | CPN + Jointformer | Average MPJPE (mm): 20.7 |
| 3d-facial-landmark-localization-on-h3wb | Jointformer | Average MPJPE (mm): 19.8 |
| 3d-facial-landmark-localization-on-h3wb | SimpleBaseline | Average MPJPE (mm): 24.6 |
| 3d-facial-landmark-localization-on-h3wb | CanonPose | Average MPJPE (mm): 24.6 |
| 3d-facial-landmark-localization-on-h3wb | CanonPose + 3D supervision | Average MPJPE (mm): 22.2 |
| 3d-facial-landmark-localization-on-h3wb | Resnet50 | Average MPJPE (mm): 26.3 |
| 3d-facial-landmark-localization-on-h3wb | Large SimpleBaseline | Average MPJPE (mm): 14.6 |
| 3d-facial-landmark-localization-on-h3wb | Large SimpleBaseline | Average MPJPE (mm): 19.8 |
| 3d-facial-landmark-localization-on-h3wb | CanonPose + 3D supervision | Average MPJPE (mm): 17.9 |
| 3d-facial-landmark-localization-on-h3wb | Jointformer | Average MPJPE (mm): 17.8 |
| 3d-hand-pose-estimation-on-h3wb | Large SimpleBaseline | Average MPJPE (mm): 44.8 |
| 3d-hand-pose-estimation-on-h3wb | CPN + Jointformer | Average MPJPE (mm): 56.9 |
| 3d-hand-pose-estimation-on-h3wb | SimpleBaseline | Average MPJPE (mm): 83.4 |
| 3d-hand-pose-estimation-on-h3wb | Large SimpleBaseline | Average MPJPE (mm): 31.7 |
| 3d-hand-pose-estimation-on-h3wb | Jointformer | Average MPJPE (mm): 43.7 |
| 3d-hand-pose-estimation-on-h3wb | SimpleBaseline | Average MPJPE (mm): 42.5 |
| 3d-hand-pose-estimation-on-h3wb | SHN + SimpleBaseline | Average MPJPE (mm): 64.3 |
| 3d-hand-pose-estimation-on-h3wb | CanonPose + 3D supervision | Average MPJPE (mm): 47.4 |
| 3d-hand-pose-estimation-on-h3wb | CanonPose | Average MPJPE (mm): 48.9 |
| 3d-hand-pose-estimation-on-h3wb | Resnet50 | Average MPJPE (mm): 63.1 |
| 3d-hand-pose-estimation-on-h3wb | Jointformer | Average MPJPE (mm): 53.5 |
| 3d-hand-pose-estimation-on-h3wb | CanonPose | Average MPJPE (mm): 56.2 |
| 3d-hand-pose-estimation-on-h3wb | CanonPose + 3D supervision | Average MPJPE (mm): 38.3 |
| 3d-human-pose-estimation-on-h3wb | CanonPose | MPJPE: 193.7 |
| 3d-human-pose-estimation-on-h3wb | CPN + Jointformer | MPJPE: 142.8 |
| 3d-human-pose-estimation-on-h3wb | Jointformer | MPJPE: 103.0 |
| 3d-human-pose-estimation-on-h3wb | Large SimpleBaseline | MPJPE: 131.6 |
| 3d-human-pose-estimation-on-h3wb | CanonPose | MPJPE: 264.4 |
| 3d-human-pose-estimation-on-h3wb | CanonPose + 3D supervision | MPJPE: 155.9 |
| 3d-human-pose-estimation-on-h3wb | Large SimpleBaseline | MPJPE: 112.6 |
| 3d-human-pose-estimation-on-h3wb | CanonPose + 3D supervision | MPJPE: 117.5 |
| 3d-human-pose-estimation-on-h3wb | Resnet50 | MPJPE: 151.6 |
| 3d-human-pose-estimation-on-h3wb | SimpleBaseline | MPJPE: 252.0 |
| 3d-human-pose-estimation-on-h3wb | Jointformer | MPJPE: 84.9 |
| 3d-human-pose-estimation-on-h3wb | SHN + SimpleBaseline | MPJPE: 189.6 |
| 3d-human-pose-estimation-on-h3wb | SimpleBaseline | MPJPE: 125.7 |
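The metric in the table, MPJPE (mean per-joint position error), is the average Euclidean distance between predicted and ground-truth 3D keypoints. The sketch below computes the plain, unaligned version; the function name `mpjpe` is illustrative, and any root-centering or part-wise alignment applied by the official H3WB evaluation is not specified on this page and is not reproduced here.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error between two (..., J, 3) arrays.

    The result is in the same unit as the inputs (millimetres if both
    inputs are in millimetres). No alignment is applied (assumption).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError(f"shape mismatch: {pred.shape} vs {gt.shape}")
    # Euclidean distance per joint, then the mean over all joints and samples.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


# Usage: every joint is offset by (3, 4, 0) mm, so each joint error is 5 mm.
gt = np.zeros((133, 3))
pred = gt + np.array([3.0, 4.0, 0.0])
print(mpjpe(pred, gt))
```

Restricting both arrays to a subset of keypoints (for example, only the face or hand indices) before calling the function gives a per-part error of the kind listed in the face and hand rows.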