Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle
Purkrabek Miroslav; Matas Jiri

Abstract
Human pose estimation methods work well on isolated people but struggle with multiple-bodies-in-proximity scenarios. Previous work has addressed this problem by conditioning pose estimation on detected bounding boxes or keypoints, but has overlooked instance masks. We propose to iteratively enforce mutual consistency of bounding boxes, instance masks, and poses. The introduced BBox-Mask-Pose (BMP) method uses three specialized models that improve each other's output in a closed loop. All models are adapted for mutual conditioning, which improves robustness in multi-body scenes. MaskPose, a new mask-conditioned pose estimation model, is the best among top-down approaches on OCHuman. BBox-Mask-Pose pushes SOTA on the OCHuman dataset in all three tasks: detection, instance segmentation, and pose estimation. It also achieves SOTA performance on COCO pose estimation. The method is especially good in scenes with large instance overlap, where it improves detection by 39% over the baseline detector. With small specialized models and faster runtime, BMP is an effective alternative to large human-centered foundational models. Code and models are available at https://MiraPurkrabek.github.io/BBox-Mask-Pose.
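The closed loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `bbox_mask_pose_loop`, the `Instance` container, the `num_iters` parameter, and the three callables (`detect`, `estimate_pose`, `refine_mask`) are hypothetical names, and the exact way each model is conditioned on the others is an assumption made for illustration. The toy stand-in models exist only so that the sketch runs.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


@dataclass
class Instance:
    """One person hypothesis: a box, plus pose and mask once estimated."""
    bbox: BBox
    keypoints: Optional[list] = None
    mask: Optional[Any] = None


def bbox_mask_pose_loop(
    image: Any,
    detect: Callable[[Any, list], List[BBox]],
    estimate_pose: Callable[[Any, BBox, Optional[Any]], list],
    refine_mask: Callable[[Any, BBox, list], Any],
    num_iters: int = 2,
) -> List[Instance]:
    """Hypothetical closed loop: each model is conditioned on the others' output."""
    instances: List[Instance] = []
    for _ in range(num_iters):
        # Assumption: the detector is told which regions are already explained.
        known_masks = [inst.mask for inst in instances if inst.mask is not None]
        new_boxes = detect(image, known_masks)
        if not new_boxes:
            break  # nothing new was found, so the loop has converged
        for bbox in new_boxes:
            inst = Instance(bbox=bbox)
            # Initial pose from the box alone, then a mask prompted by that pose.
            inst.keypoints = estimate_pose(image, inst.bbox, None)
            inst.mask = refine_mask(image, inst.bbox, inst.keypoints)
            # Re-estimate the pose, now conditioned on the mask as well.
            inst.keypoints = estimate_pose(image, inst.bbox, inst.mask)
            instances.append(inst)
    return instances


# Toy stand-ins for the three models (illustrative only).
def toy_detect(image, known_masks):
    people = [(0, 0, 10, 20), (5, 0, 15, 20)]  # two overlapping people
    idx = len(known_masks)  # reveal one more person per pass
    return [people[idx]] if idx < len(people) else []


def toy_pose(image, bbox, mask):
    x0, y0, x1, y1 = bbox
    return [((x0 + x1) / 2, (y0 + y1) / 2)]  # a single "keypoint" at the box centre


def toy_mask(image, bbox, keypoints):
    return {"bbox": bbox, "seed_points": keypoints}


if __name__ == "__main__":
    found = bbox_mask_pose_loop(None, toy_detect, toy_pose, toy_mask, num_iters=3)
    for inst in found:
        print(inst.bbox, inst.keypoints)
```

In this toy setup the second person is only found on the second pass, after the first person's mask is known. This is meant to mirror, in spirit, how mutual conditioning can help in scenes with overlapping bodies.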
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| 2d-human-pose-estimation-on-ochuman | BBox-Mask-Pose 2x | Test AP: 48.3; Validation AP: 48.6 |
| human-instance-segmentation-on-ochuman | RTMDet-ins-l | AP: 26.5 |
| human-instance-segmentation-on-ochuman | BBox-Mask-Pose 2x | AP: 32.4 |
| keypoint-detection-on-ochuman | BBox-Mask-Pose 2x | Test AP: 48.3; Validation AP: 48.6 |
| pose-estimation-on-ochuman | BBox-Mask-Pose 2x | Test AP: 48.3; Validation AP: 48.6 |
| pose-estimation-on-ochuman | MaskPose-b | Test AP: 45.0; Validation AP: 45.3 |
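To compare rows of the table, the absolute and relative AP gains can be computed directly. The short sketch below uses only values listed in the table; the helper name `ap_gain` is hypothetical. These gains are separate from the 39% detection improvement quoted in the abstract, which refers to scenes with large instance overlap and is not derived from this table.

```python
def ap_gain(baseline: float, improved: float) -> tuple:
    """Return (absolute gain in AP points, relative gain in percent), rounded to 1 decimal."""
    absolute = improved - baseline
    relative = 100.0 * absolute / baseline
    return round(absolute, 1), round(relative, 1)


# OCHuman instance segmentation: RTMDet-ins-l vs. BBox-Mask-Pose 2x
seg_abs, seg_rel = ap_gain(26.5, 32.4)

# OCHuman pose estimation (test AP): MaskPose-b vs. BBox-Mask-Pose 2x
pose_abs, pose_rel = ap_gain(45.0, 48.3)

print(f"segmentation: +{seg_abs} AP ({seg_rel}% relative)")
print(f"pose (test):  +{pose_abs} AP ({pose_rel}% relative)")
```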