5 months ago

X-Pose: Detecting Any Keypoints

Yang Jie ; Zeng Ailing ; Zhang Ruimao ; Zhang Lei

Abstract

This work aims to address an advanced keypoint detection problem: how to accurately detect any keypoints in complex real-world scenarios, which involves massive, messy, and open-ended objects as well as their associated keypoints definitions. Current high-performance keypoint detectors often fail to tackle this problem due to their two-stage schemes, under-explored prompt designs, and limited training data. To bridge the gap, we propose X-Pose, a novel end-to-end framework with multi-modal (i.e., visual, textual, or their combinations) prompts to detect multi-object keypoints for any articulated (e.g., human and animal), rigid, and soft objects within a given image. Moreover, we introduce a large-scale dataset called UniKPT, which unifies 13 keypoint detection datasets with 338 keypoints across 1,237 categories over 400K instances. Training with UniKPT, X-Pose effectively aligns text-to-keypoint and image-to-keypoint due to the mutual enhancement of multi-modal prompts based on cross-modality contrastive learning. Our experimental results demonstrate that X-Pose achieves notable improvements of 27.7 AP, 6.44 PCK, and 7.0 AP compared to state-of-the-art non-promptable, visual prompt-based, and textual prompt-based methods in each respective fair setting. More importantly, the in-the-wild test demonstrates X-Pose's strong fine-grained keypoint localization and generalization abilities across image styles, object categories, and poses, paving a new path to multi-object keypoint detection in real applications. Our code and dataset are available at https://github.com/IDEA-Research/X-Pose.
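
The abstract attributes the text-to-keypoint and image-to-keypoint alignment to cross-modality contrastive learning but does not spell out the loss. As a rough illustration only, the sketch below shows a generic symmetric InfoNCE-style objective over matched prompt and keypoint embeddings. It is not taken from the X-Pose code: the function names, the `temperature` value, and the assumption that row i of each array forms a matched pair are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(prompt_emb, keypoint_emb, temperature=0.07):
    """Generic symmetric InfoNCE loss (an assumption, not the paper's exact loss).

    prompt_emb, keypoint_emb: arrays of shape (N, D); row i of each is
    assumed to be a matched prompt/keypoint pair, all other rows are negatives.
    """
    p = l2_normalize(np.asarray(prompt_emb, dtype=float))
    k = l2_normalize(np.asarray(keypoint_emb, dtype=float))
    logits = p @ k.T / temperature          # (N, N) similarity matrix
    idx = np.arange(len(p))                 # matched pairs sit on the diagonal
    loss_p2k = -log_softmax(logits, axis=1)[idx, idx].mean()  # prompt -> keypoint
    loss_k2p = -log_softmax(logits, axis=0)[idx, idx].mean()  # keypoint -> prompt
    return 0.5 * (loss_p2k + loss_k2p)


# Toy usage with random vectors standing in for real embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
aligned_loss = symmetric_contrastive_loss(emb, emb)
shuffled_loss = symmetric_contrastive_loss(emb, np.roll(emb, 1, axis=0))
```

In this toy setup the aligned pairs yield a lower loss than the shuffled ones, which is the behaviour a contrastive objective is meant to reward.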

Code Repositories

idea-research/x-pose (Official, PyTorch)
IDEA-Research/UniPose (Official, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
2d-human-pose-estimation-on-human-art | UniPose | AP: 0.759
animal-pose-estimation-on-ap-10k | UniPose | AP: 79.2
multi-person-pose-estimation-on-coco | UniPose | AP: 0.768
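
For context on the AP values listed above: COCO-style keypoint AP is conventionally computed by thresholding Object Keypoint Similarity (OKS) between predicted and ground-truth poses. The sketch below follows the standard OKS definition used in COCO keypoint evaluation. Whether every benchmark listed here applies exactly this protocol is an assumption, and the per-keypoint `sigmas` in the usage example are placeholder values, not official constants.

```python
import numpy as np


def object_keypoint_similarity(pred, gt, visibility, area, sigmas):
    """OKS between one predicted pose and one ground-truth pose.

    pred, gt:    (K, 2) arrays of (x, y) keypoint coordinates in pixels.
    visibility:  (K,) array; values > 0 mark keypoints that are labeled.
    area:        ground-truth object area in pixels squared (scale term).
    sigmas:      (K,) per-keypoint falloff constants (dataset specific).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    labeled = np.asarray(visibility) > 0
    if not labeled.any():
        return 0.0                                   # nothing to compare against
    d2 = ((pred - gt) ** 2).sum(axis=1)              # squared pixel distances
    variances = (2.0 * sigmas) ** 2                  # k_i^2 with k_i = 2 * sigma_i
    e = d2 / (2.0 * variances * (area + np.spacing(1)))
    return float(np.exp(-e)[labeled].mean())         # average over labeled keypoints


# Toy usage: three keypoints, the last one unlabeled; sigmas are placeholders.
gt_pose = np.array([[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]])
vis = np.array([2, 2, 0])
toy_sigmas = np.array([0.05, 0.05, 0.05])
oks_perfect = object_keypoint_similarity(gt_pose, gt_pose, vis, 100.0, toy_sigmas)
oks_shifted = object_keypoint_similarity(gt_pose + 100.0, gt_pose, vis, 100.0, toy_sigmas)
```

A prediction counts as a true positive at a given threshold (for example 0.5) when its OKS meets that threshold, and AP is then averaged over a range of thresholds.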
