
Abstract
Expressive human pose and shape estimation (EHPS) unifies body, hands, and face motion capture with numerous applications. Despite encouraging progress, current state-of-the-art methods still depend largely on a confined set of training datasets. In this work, we investigate scaling up EHPS towards the first generalist foundation model (dubbed SMPLer-X), with up to ViT-Huge as the backbone and training with up to 4.5M instances from diverse data sources. With big data and the large model, SMPLer-X exhibits strong performance across diverse test benchmarks and excellent transferability to even unseen environments. 1) For the data scaling, we perform a systematic investigation on 32 EHPS datasets, including a wide range of scenarios that a model trained on any single dataset cannot handle. More importantly, capitalizing on insights obtained from the extensive benchmarking process, we optimize our training scheme and select datasets that lead to a significant leap in EHPS capabilities. 2) For the model scaling, we take advantage of vision transformers to study the scaling law of model sizes in EHPS. Moreover, our finetuning strategy turns SMPLer-X into specialist models, allowing them to achieve further performance boosts. Notably, our foundation model SMPLer-X consistently delivers state-of-the-art results on seven benchmarks such as AGORA (107.2 mm NMVE), UBody (57.4 mm PVE), EgoBody (63.6 mm PVE), and EHF (62.3 mm PVE without finetuning). Homepage: https://caizhongang.github.io/projects/SMPLer-X/
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-human-pose-estimation-on-3dpw | SMPLer-X | MPJPE: 75.2 |
| 3d-human-pose-estimation-on-ubody | SMPLer-X | PA-PVE-All: 31.9; PA-PVE-Face: 2.8; PA-PVE-Hands: 10.3; PVE-All: 57.5; PVE-Face: 21.6; PVE-Hands: 40.2 |
| 3d-human-reconstruction-on-ehf | SMPLer-X | MPVPE: 62.4; PA V2V (mm), whole body: 37.1 |
| 3d-multi-person-mesh-recovery-on-agora | SMPLer-X | B-NMVE: 68.3; F-MVE: 29.9; FB-MVE: 99.7; FB-NMVE: 107.2; LH/RH-MVE: 39.3 |
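
For readers unfamiliar with the metric names in the table, the following is a minimal NumPy sketch of how per-vertex errors of this kind are commonly computed. It is not taken from the SMPLer-X code base: the function names `pve`, `procrustes_align`, `pa_pve`, and `nmve` are hypothetical, and the sketch assumes predicted and ground-truth meshes share the same vertex ordering and unit (mm). The "PA" variants are assumed here to use a standard similarity (Procrustes) alignment before measuring error, and NMVE is assumed to follow the AGORA convention of dividing the mean vertex error by the detection F1 score.

```python
import numpy as np

def pve(pred, gt):
    """Mean per-vertex Euclidean error for (V, 3) arrays in the same unit."""
    return float(np.linalg.norm(pred - gt, axis=1).mean())

def procrustes_align(pred, gt):
    """Align pred to gt with the best similarity transform (scale, rotation, translation)."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the 3x3 cross-covariance matrix (Kabsch/Umeyama solution).
    U, S, Vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()
    return scale * (p @ R.T) + mu_g

def pa_pve(pred, gt):
    """Procrustes-aligned PVE: error that remains after removing global scale, rotation, translation."""
    return pve(procrustes_align(pred, gt), gt)

def nmve(mve, f1):
    """Normalized mean vertex error, assuming the AGORA-style definition MVE / F1."""
    return mve / f1

# Synthetic check: a rotated, scaled, and shifted copy of a mesh has a large PVE
# but a PA-PVE close to zero, because alignment removes the global transform.
rng = np.random.default_rng(0)
gt_vertices = rng.normal(size=(100, 3))
theta = 0.3
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
pred_vertices = 1.5 * gt_vertices @ rot_z.T + np.array([10.0, -5.0, 2.0])
shifted = gt_vertices + np.array([3.0, 4.0, 0.0])
```

In this sketch, `pve(shifted, gt_vertices)` measures a pure translation of length 5, while `pa_pve(pred_vertices, gt_vertices)` is near zero. This is why PA metrics in the table are much smaller than their unaligned counterparts: they isolate errors in articulated pose and shape from errors in global placement.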