
Abstract
3D human pose estimation from a monocular image or 2D joints is an ill-posed problem because of depth ambiguity and occluded joints. We argue that 3D human pose estimation from a monocular input is an inverse problem that admits multiple feasible solutions. In this paper, we propose a novel approach to generate multiple feasible hypotheses of the 3D pose from 2D joints. In contrast to existing deep learning approaches that minimize a mean squared error based on a unimodal Gaussian distribution, our method generates multiple feasible hypotheses of the 3D pose based on a multimodal mixture density network. Experimental results show that the 3D poses estimated by our approach from 2D joint inputs are consistent in their 2D reprojections, which supports our argument that multiple solutions exist for the 2D-to-3D inverse problem. Furthermore, we achieve state-of-the-art performance on the Human3.6M dataset in both the best-hypothesis and multi-view settings, and we demonstrate the generalization capability of our model by testing on the MPII and MPI-INF-3DHP datasets. Our code is available on the project website.
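The following is a minimal PyTorch sketch of how a multimodal mixture density network can be attached to a 2D-to-3D lifting backbone to produce multiple pose hypotheses. The MLP backbone, layer widths, joint count, number of mixture kernels, and the isotropic-Gaussian likelihood are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch: a mixture-density head over 3D poses conditioned on 2D joints.
# All sizes below are assumptions for illustration only.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_JOINTS = 16          # assumed joint count (Human3.6M-style skeleton)
NUM_KERNELS = 5          # assumed number of mixture components (hypotheses)
POSE_DIM = NUM_JOINTS * 3


class PoseMDN(nn.Module):
    """Maps 2D joints (B, J*2) to a Gaussian mixture over 3D poses (B, J*3)."""

    def __init__(self, hidden=1024):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(NUM_JOINTS * 2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.pi = nn.Linear(hidden, NUM_KERNELS)             # mixture weights
        self.mu = nn.Linear(hidden, NUM_KERNELS * POSE_DIM)  # per-kernel pose means
        self.sigma = nn.Linear(hidden, NUM_KERNELS)          # per-kernel isotropic scales

    def forward(self, joints_2d):
        h = self.backbone(joints_2d)
        pi = F.softmax(self.pi(h), dim=-1)                   # (B, K)
        mu = self.mu(h).view(-1, NUM_KERNELS, POSE_DIM)      # (B, K, J*3)
        sigma = F.softplus(self.sigma(h)) + 1e-6             # (B, K), strictly positive
        return pi, mu, sigma


def mdn_nll(pi, mu, sigma, target):
    """Negative log-likelihood of the ground-truth 3D pose under the mixture."""
    target = target.unsqueeze(1)                             # (B, 1, J*3)
    # Log-density of an isotropic Gaussian for each kernel.
    log_prob = (-0.5 * ((target - mu) ** 2).sum(-1) / sigma ** 2
                - POSE_DIM * torch.log(sigma)
                - 0.5 * POSE_DIM * math.log(2 * math.pi))
    return -torch.logsumexp(torch.log(pi + 1e-12) + log_prob, dim=-1).mean()


if __name__ == "__main__":
    model = PoseMDN()
    x = torch.randn(8, NUM_JOINTS * 2)    # batch of 2D joint inputs
    y = torch.randn(8, POSE_DIM)          # ground-truth 3D poses
    pi, mu, sigma = model(x)
    print(mdn_nll(pi, mu, sigma, y))      # scalar training loss
    # At test time, the K kernel means `mu` serve as the multiple 3D pose hypotheses.
```

The K kernel means act as the multiple hypotheses, while the mixture weights indicate which hypothesis the network considers most likely.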
Code Repositories
vnmr/JointVideoPose3D
pytorch
Mentioned in GitHub
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| 3d-human-pose-estimation-on-human36m | MDN (Multi-View) | Average MPJPE (mm): 49.6; Multi-View or Monocular: Multi-View; Using 2D ground-truth joints: No |
| 3d-human-pose-estimation-on-human36m | MDN | Average MPJPE (mm): 52.7; PA-MPJPE: 42.6; Multi-View or Monocular: Monocular; Using 2D ground-truth joints: No |
| 3d-human-pose-estimation-on-mpi-inf-3dhp | MDN | PCK: 67.9 |
| monocular-3d-human-pose-estimation-on-human3 | Multimodal Mixture Density Networks | Average MPJPE (mm): 52.7; Frames Needed: 1; Need Ground Truth 2D Pose: No; Use Video Sequence: No |
| multi-hypotheses-3d-human-pose-estimation-on | MDN | Average MPJPE (mm): 52.7; Average PMPJPE (mm): 42.6 |
| multi-hypotheses-3d-human-pose-estimation-on-2 | SMPL-MDN (by 3D Multi-bodies) | Best-Hypothesis MPJPE (n = 25): 91.5; Best-Hypothesis PMPJPE (n = 25): 69.5; H36M PMPJPE (n = 1): 44.8; H36M PMPJPE (n = 25): 42.7; Most-Likely Hypothesis PMPJPE (n = 1): 74.7 |
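The table reports errors as MPJPE and PA-MPJPE/PMPJPE in millimetres. For reference, the following is a hedged sketch of how these metrics are conventionally computed (mean per-joint Euclidean error, with PA-MPJPE measured after a Procrustes rigid-plus-scale alignment); the joint count and this NumPy implementation are assumptions, not code from this paper's repository.

```python
# Sketch of the standard MPJPE and PA-MPJPE metrics (in mm).
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error for pred, gt of shape (J, 3)."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

def pa_mpjpe(pred, gt):
    """MPJPE after rigid + scale (Procrustes) alignment of pred onto gt."""
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    p, g = pred - mu_p, gt - mu_g
    # Optimal rotation via SVD of the cross-covariance matrix (Kabsch).
    U, S, Vt = np.linalg.svd(p.T @ g)
    R = (U @ Vt).T
    if np.linalg.det(R) < 0:        # avoid an improper rotation (reflection)
        Vt[-1] *= -1
        S[-1] *= -1
        R = (U @ Vt).T
    scale = S.sum() / (p ** 2).sum()
    aligned = scale * p @ R.T + mu_g
    return mpjpe(aligned, gt)

if __name__ == "__main__":
    gt = np.random.randn(16, 3) * 100           # synthetic 16-joint pose, mm
    pred = gt + np.random.randn(16, 3) * 20     # noisy prediction
    print(mpjpe(pred, gt), pa_mpjpe(pred, gt))  # PA-MPJPE <= MPJPE by construction
```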