5 months ago

Rethinking pose estimation in crowds: overcoming the detection information-bottleneck and ambiguity

Zhou Mu ; Stoffl Lucas ; Mathis Mackenzie Weygandt ; Mathis Alexander

Abstract

Frequent interactions between individuals are a fundamental challenge for pose estimation algorithms. Current pipelines either use an object detector together with a pose estimator (top-down approach), or localize all body parts first and then link them to predict the pose of individuals (bottom-up). Yet, when individuals closely interact, top-down methods are ill-defined due to overlapping individuals, and bottom-up methods often falsely infer connections to distant body parts. Thus, we propose a novel pipeline called bottom-up conditioned top-down pose estimation (BUCTD) that combines the strengths of bottom-up and top-down methods. Specifically, we propose to use a bottom-up model as the detector, which in addition to an estimated bounding box provides a pose proposal that is fed as condition to an attention-based top-down model. We demonstrate the performance and efficiency of our approach on animal and human pose estimation benchmarks. On CrowdPose and OCHuman, we outperform previous state-of-the-art models by a significant margin. We achieve 78.5 AP on CrowdPose and 48.5 AP on OCHuman, an improvement of 8.6% and 7.8% over the prior art, respectively. Furthermore, we show that our method strongly improves the performance on multi-animal benchmarks involving fish and monkeys. The code is available at https://github.com/amathislab/BUCTD
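The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `bottom_up` and `top_down` are hypothetical callables standing in for the real models, the `Proposal` container and `box_from_keypoints` helper are invented for this example, and deriving the box from the keypoints with a margin is an assumption (the abstract only says the bottom-up model supplies both a bounding box and a pose proposal).

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Keypoint = Tuple[float, float]           # (x, y) in image pixels
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class Proposal:
    """Per-individual output of the bottom-up stage (hypothetical container)."""
    keypoints: List[Keypoint]
    box: Box


def box_from_keypoints(keypoints: Sequence[Keypoint], margin: float = 0.1) -> Box:
    """Tight box around the keypoints, padded by `margin` times its size per side.

    Assumption for illustration: the box is derived from the pose proposal.
    """
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    pad_x = margin * (max(xs) - min(xs))
    pad_y = margin * (max(ys) - min(ys))
    return (min(xs) - pad_x, min(ys) - pad_y, max(xs) + pad_x, max(ys) + pad_y)


def conditioned_top_down(
    image,
    bottom_up: Callable[[object], List[List[Keypoint]]],
    top_down: Callable[[object, Box, List[Keypoint]], List[Keypoint]],
    margin: float = 0.1,
) -> List[List[Keypoint]]:
    """Run the bottom-up stage as detector, then refine each individual.

    The top-down stage receives the image, the box, and the pose proposal
    as its condition, so overlapping individuals remain distinguishable.
    """
    poses = []
    for keypoints in bottom_up(image):
        proposal = Proposal(list(keypoints), box_from_keypoints(keypoints, margin))
        poses.append(top_down(image, proposal.box, proposal.keypoints))
    return poses


# Toy stand-ins: two overlapping individuals and an identity "refinement".
def toy_bottom_up(image):
    return [[(10.0, 20.0), (30.0, 60.0)], [(25.0, 22.0), (45.0, 62.0)]]


def toy_top_down(image, box, condition):
    return list(condition)


refined = conditioned_top_down(None, toy_bottom_up, toy_top_down)
```

The point of the sketch is that the two toy individuals have overlapping boxes, so a box alone would be ambiguous; passing each pose proposal alongside its box tells the second stage which individual to predict.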

Code Repositories

amathislab/BUCTD (official, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| animal-pose-estimation-on-fish-100 | HRNet-W48 + Faster R-CNN | mAP: 89.1 |
| animal-pose-estimation-on-fish-100 | BUCTD-preNet-W48 (DLCRNet) | mAP: 88.7 |
| animal-pose-estimation-on-fish-100 | BUCTD-preNet-W48 (CID-W32) | mAP: 88.0 |
| animal-pose-estimation-on-marmoset-8k | BUCTD-preNet-W48 (CID-W32) | mAP: 93.3 |
| animal-pose-estimation-on-marmoset-8k | BUCTD-CoAM-W48 (DLCRNet) | mAP: 91.6 |
| animal-pose-estimation-on-marmoset-8k | CID-W32 | mAP: 92.5 |
| animal-pose-estimation-on-trimouse-161 | BUCTD-CoAM-W48 (DLCRNet) | mAP: 99.1 |
| animal-pose-estimation-on-trimouse-161 | DLCRNet | mAP: 95.8 |
| animal-pose-estimation-on-trimouse-161 | CID-W32 | mAP: 86.8 |
| multi-person-pose-estimation-on-crowdpose | BUCTD-W48 (w/cond. input from PETR, and generative sampling) | AP Easy: 83.9; AP Hard: 72.3; AP Medium: 79.0; mAP @0.5:0.95: 78.5 |
| pose-estimation-on-coco | BUCTD (PETR, with generative sampling) | APL: 83.7; APM: 74.2 |
| pose-estimation-on-coco | BUCTD (PETR, with generative sampling) | AP: 77.8 |
| pose-estimation-on-crowdpose | BUCTD-W48 | AP: 72.9 |
| pose-estimation-on-crowdpose | BUCTD-W48 (w/cond. input from PETR) | AP: 76.7 |
| pose-estimation-on-crowdpose | BUCTD-W48 (w/cond. input from PETR, and generative sampling) | AP: 78.5; AP Easy: 83.9; AP Hard: 72.3; AP Medium: 79.0 |
| pose-estimation-on-ochuman | BUCTD (CID-W32) | Test AP: 47.2; Validation AP: 47.7 |
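
The three pose-estimation-on-crowdpose entries in the listing above can be compared step by step. The short sketch below only subtracts the listed AP values from one another; the `stepwise_gains` helper is hypothetical, and reading the differences as the contribution of each added component is an interpretation of the listing, not a claim taken from the paper.

```python
# AP values copied from the pose-estimation-on-crowdpose entries of the listing.
crowdpose_ap = {
    "BUCTD-W48": 72.9,
    "BUCTD-W48 (w/cond. input from PETR)": 76.7,
    "BUCTD-W48 (w/cond. input from PETR, and generative sampling)": 78.5,
}


def stepwise_gains(scores):
    """Return (variant, gain over the previous variant) pairs, in AP points."""
    names = list(scores)  # dicts keep insertion order
    return [
        (cur, round(scores[cur] - scores[prev], 1))
        for prev, cur in zip(names, names[1:])
    ]


gains = stepwise_gains(crowdpose_ap)
for variant, gain in gains:
    print(f"{variant}: +{gain} AP")
```

On these numbers, the entry with conditional input from PETR is 3.8 AP points above the plain BUCTD-W48 entry, and the entry with generative sampling is a further 1.8 points higher.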
