Multi-Instance Pose Networks: Rethinking Top-Down Pose Estimation

Rawal Khirodkar, Visesh Chari, Amit Agrawal, Ambrish Tyagi

Abstract

A key assumption of top-down human pose estimation approaches is their expectation of having a single person/instance present in the input bounding box. This often leads to failures in crowded scenes with occlusions. We propose a novel solution to overcome the limitations of this fundamental assumption. Our Multi-Instance Pose Network (MIPNet) allows for predicting multiple 2D pose instances within a given bounding box. We introduce a Multi-Instance Modulation Block (MIMB) that can adaptively modulate channel-wise feature responses for each instance and is parameter efficient. We demonstrate the efficacy of our approach by evaluating on COCO, CrowdPose, and OCHuman datasets. Specifically, we achieve 70.0 AP on CrowdPose and 42.5 AP on OCHuman test sets, a significant improvement of 2.4 AP and 6.5 AP over the prior art, respectively. When using ground truth bounding boxes for inference, MIPNet achieves an improvement of 0.7 AP on COCO, 0.9 AP on CrowdPose, and 9.1 AP on OCHuman validation sets compared to HRNet. Interestingly, when fewer, high-confidence bounding boxes are used, HRNet's performance degrades (by 5 AP) on OCHuman, whereas MIPNet maintains a relatively stable performance (drop of 1 AP) for the same inputs.
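To make the idea of per-instance, channel-wise modulation concrete, here is a minimal sketch in plain Python (no deep learning framework, so it runs anywhere). It is not the official MIMB: the class name `InstanceModulation`, the single shared weight matrix, and the per-instance embedding vector are all assumptions chosen for illustration, loosely following squeeze-and-excitation style gating. The actual block is defined in the paper and the official repository and may differ. The sketch only shows how one set of shared features can be rescaled differently for each instance index while adding just one small vector of parameters per instance.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_means(features):
    """Squeeze step: average each channel of a C x H x W nested list to one scalar."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]


class InstanceModulation:
    """Hypothetical channel-wise modulation block (illustration only, not the official MIMB)."""

    def __init__(self, num_channels, num_instances=2, seed=0):
        rng = random.Random(seed)
        # Shared gating weights: one C x C matrix reused by every instance.
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(num_channels)]
            for _ in range(num_channels)
        ]
        # Per-instance embedding: only C extra parameters per instance (assumed design).
        self.embeddings = [
            [rng.uniform(-1.0, 1.0) for _ in range(num_channels)]
            for _ in range(num_instances)
        ]

    def gates(self, features, instance):
        """Return one gate in (0, 1) per channel for the given instance index."""
        squeezed = channel_means(features)
        embedding = self.embeddings[instance]
        return [
            sigmoid(sum(w * s for w, s in zip(row, squeezed)) + e)
            for row, e in zip(self.weights, embedding)
        ]

    def __call__(self, features, instance):
        """Scale every value in channel c by that channel's gate."""
        scale = self.gates(features, instance)
        return [
            [[value * g for value in row] for row in channel]
            for channel, g in zip(features, scale)
        ]


# Toy 3-channel, 2 x 2 feature map standing in for backbone features of one bounding box.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [0.5, 0.5]],
    [[-1.0, 0.0], [1.0, 2.0]],
]
block = InstanceModulation(num_channels=3)
primary = block(features, instance=0)    # features modulated for the first person
secondary = block(features, instance=1)  # same box, modulated for a second person
```

In this sketch the same input features produce two differently scaled outputs, one per instance index, which a shared prediction head could then decode into two separate poses for the same bounding box.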

Code Repositories

rawalkhirodkar/MIPNet (official implementation, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| 2d-human-pose-estimation-on-ochuman | HRNet-W48 | Test AP: 37.2; Validation AP: 37.8 |
| 2d-human-pose-estimation-on-ochuman | MIPNet (HRNet-W48) | Test AP: 42.5; Validation AP: 42.0 |
| keypoint-detection-on-coco | MIPNet (384x288) | Test AP: 75.7; Validation AP: 76.3 |
| keypoint-detection-on-ochuman | MIPNet (HRNet-W48) | Test AP: 42.5; Validation AP: 42.0 |
| keypoint-detection-on-ochuman | HRNet-W48 | Test AP: 37.2; Validation AP: 37.8 |
| multi-person-pose-estimation-on-crowdpose | MIPNet (HRNet-W48) | AP Easy: 78.1; AP Hard: 59.4; AP Medium: 71.1; mAP @0.5:0.95: 70.0 |
| multi-person-pose-estimation-on-ochuman | MIPNet (gt-bb) | AP50: 89.7; AP75: 80.1; Validation AP: 74.1 |
| pose-estimation-on-coco-test-dev | MIPNet | AP: 75.7; AP50: 92.4; AP75: 83.3; APL: 81.2; APM: 71.4; AR: 80.5 |
| pose-estimation-on-crowdpose | MIPNet (HRNet-W48) | AP: 70.0; AP Hard: 59.4; APM: 71.1 |
| pose-estimation-on-ochuman | MIPNet (HRNet-W48) | Test AP: 42.5; Validation AP: 42.0 |
| pose-estimation-on-ochuman | HRNet-W48 | Test AP: 37.2; Validation AP: 37.8 |
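
As a quick check on the OCHuman entries listed above, the following snippet computes the gap between the MIPNet (HRNet-W48) and HRNet-W48 rows. The dictionary layout and the helper name `ap_gain` are illustrative choices, and the numbers are copied from the listing. Note that this gap is measured against the listed HRNet-W48 rows, which is a different comparison from the "prior art" improvement quoted in the abstract.

```python
# (test AP, validation AP) on OCHuman, copied from the benchmark listing.
ochuman = {
    "HRNet-W48": (37.2, 37.8),
    "MIPNet (HRNet-W48)": (42.5, 42.0),
}


def ap_gain(results, baseline, method):
    """Per-split AP difference (method minus baseline), rounded to one decimal."""
    return tuple(
        round(m - b, 1) for b, m in zip(results[baseline], results[method])
    )


test_gain, val_gain = ap_gain(ochuman, "HRNet-W48", "MIPNet (HRNet-W48)")
print(f"test: +{test_gain} AP, validation: +{val_gain} AP")
# prints "test: +5.3 AP, validation: +4.2 AP"
```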
