Rawal Khirodkar, Timur Bagautdinov, Julieta Martinez, Su Zhaoen, Austin James, Peter Selednik, Stuart Anderson, Shunsuke Saito

Abstract
We present Sapiens, a family of models for four fundamental human-centric vision tasks - 2D pose estimation, body-part segmentation, depth estimation, and surface normal prediction. Our models natively support 1K high-resolution inference and are extremely easy to adapt for individual tasks by simply fine-tuning models pretrained on over 300 million in-the-wild human images. We observe that, given the same computational budget, self-supervised pretraining on a curated dataset of human images significantly boosts the performance for a diverse set of human-centric tasks. The resulting models exhibit remarkable generalization to in-the-wild data, even when labeled data is scarce or entirely synthetic. Our simple model design also brings scalability - model performance across tasks improves as we scale the number of parameters from 0.3 to 2 billion. Sapiens consistently surpasses existing baselines across various human-centric benchmarks. We achieve significant improvements over the prior state-of-the-art on Humans-5K (pose) by 7.6 mAP, Humans-2K (part-seg) by 17.1 mIoU, Hi4D (depth) by 22.4% relative RMSE, and THuman2 (normal) by 53.5% relative angular error.
Benchmarks
| Benchmark | Model | Metrics |
|---|---|---|
| 2d-human-pose-estimation-on-coco-wholebody-1 | Sapiens-2B | WB: 74.4 body: 79.2 face: 91.2 foot: 84.1 hand: 70.4 |
| 2d-human-pose-estimation-on-coco-wholebody-1 | Sapiens-0.3B | WB: 62.0 body: 66.4 face: 87.1 foot: 67.3 hand: 58.1 |
| keypoint-detection-on-coco | Sapiens-1B | Validation AP: 82.1 |
| keypoint-detection-on-coco | Sapiens-2B | Validation AP: 82.2 |
| keypoint-detection-on-coco | Sapiens-0.3B | Validation AP: 79.6 |
| keypoint-detection-on-coco | Sapiens-0.6B | Validation AP: 81.2 |
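
The scaling trend reported in the table can be checked with a few lines of arithmetic. The sketch below copies the numbers from the rows above and computes the gain from Sapiens-0.3B to Sapiens-2B; the variable names and the `gains` helper are illustrative assumptions, not part of any Sapiens codebase.

```python
# COCO validation AP by model size, copied from the benchmark table.
coco_val_ap = {"0.3B": 79.6, "0.6B": 81.2, "1B": 82.1, "2B": 82.2}

# COCO-WholeBody scores per keypoint group, copied from the benchmark table.
wholebody = {
    "0.3B": {"WB": 62.0, "body": 66.4, "face": 87.1, "foot": 67.3, "hand": 58.1},
    "2B": {"WB": 74.4, "body": 79.2, "face": 91.2, "foot": 84.1, "hand": 70.4},
}


def gains(small, large):
    """Per-metric absolute gain of the larger model (hypothetical helper)."""
    return {name: round(large[name] - small[name], 1) for name in small}


wholebody_gain = gains(wholebody["0.3B"], wholebody["2B"])
coco_gain = round(coco_val_ap["2B"] - coco_val_ap["0.3B"], 1)

# Keypoint group that benefits most from scaling 0.3B -> 2B.
largest_group = max(wholebody_gain, key=wholebody_gain.get)

print("COCO-WholeBody gain (0.3B -> 2B):", wholebody_gain)
print("COCO validation AP gain (0.3B -> 2B):", coco_gain)
print("Largest WholeBody gain:", largest_group)
```

On these numbers the gain is largest for foot keypoints (16.8) and smallest for face keypoints (4.1), while COCO validation AP rises by 2.6 points from the smallest to the largest model.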