Non-Local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation
Jogendra Nath Kundu; Siddharth Seth; Anirudh Jamkhandi; Pradyumna YM; Varun Jampani; Anirban Chakraborty; R. Venkatesh Babu

Abstract
Existing 3D human pose estimation approaches rely on different forms of strong (2D/3D pose) or weak (multi-view or depth) paired supervision. Outside synthetic or in-studio domains, acquiring such supervision for each new target environment is highly inconvenient. To this end, we cast 3D pose learning as a self-supervised adaptation problem that aims to transfer the task knowledge from a labeled source domain to a completely unpaired target. We propose to infer image-to-pose via two explicit mappings, viz. image-to-latent and latent-to-pose, where the latter is a pre-learned decoder obtained from a prior-enforcing generative adversarial auto-encoder. Next, we introduce relation distillation as a means to align the unpaired cross-modal samples, i.e., the unpaired target videos and unpaired 3D pose sequences. Specifically, we propose a new set of non-local relations that characterize long-range latent pose interactions, unlike general contrastive relations, where positive couplings are limited to a local neighborhood structure. Further, we provide an objective way to quantify non-localness in order to select the most effective relation set. We evaluate different self-adaptation settings and demonstrate state-of-the-art 3D human pose estimation performance on standard benchmarks.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-human-pose-estimation-on-3dpw | Non-Local Latent Relation Distillation | PA-MPJPE: 72.1 |
| unsupervised-3d-human-pose-estimation-on | Non-Local Latent Relation Distillation | MPJPE: 97.8; PA-MPJPE: 86.2 |
| weakly-supervised-3d-human-pose-estimation-on | Non-Local Latent Relation Distillation | Average MPJPE (mm): 57.6; PA-MPJPE: 48.2 |
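
The table reports MPJPE and PA-MPJPE. As a rough illustration of how these two metrics are commonly computed, here is a minimal NumPy sketch. It is not the authors' evaluation code: the function names (`mpjpe`, `procrustes_align`, `pa_mpjpe`) are hypothetical, and the inputs are assumed to be single-frame arrays of shape `(J, 3)` holding joint coordinates in millimetres.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: mean Euclidean distance over joints."""
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))


def procrustes_align(pred, gt):
    """Align pred (J, 3) to gt (J, 3) with the best similarity transform
    (scale, rotation, translation) in the least-squares sense."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g            # centre both point sets

    # Optimal rotation from the SVD of the 3x3 cross-covariance matrix.
    U, S, Vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T

    # Optimal isotropic scale for the chosen rotation.
    scale = np.trace(D @ np.diag(S)) / np.sum(p ** 2)
    return scale * (p @ R.T) + mu_g


def pa_mpjpe(pred, gt):
    """Procrustes-aligned MPJPE: MPJPE after similarity alignment to gt."""
    return mpjpe(procrustes_align(pred, gt), gt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(17, 3)) * 100.0     # 17 joints is an assumption
    pred = gt + rng.normal(size=gt.shape) * 10.0
    print("MPJPE:", mpjpe(pred, gt))
    print("PA-MPJPE:", pa_mpjpe(pred, gt))
```

Because the alignment removes global scale, rotation, and translation, PA-MPJPE isolates errors in the articulated pose itself, which is why it is typically lower than MPJPE for the same predictions.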