You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception
Sheng Jin, Shuhuai Li, Tong Li, Wentao Liu, Chen Qian, Ping Luo

Abstract
Human-centric perception (e.g. detection, segmentation, pose estimation, and attribute analysis) is a long-standing problem in computer vision. This paper introduces a unified and versatile framework (HQNet) for single-stage multi-person multi-task human-centric perception (HCP). Our approach centers on learning a unified human query representation, denoted as Human Query, which captures intricate instance-level features for individual persons and disentangles complex multi-person scenarios. Although different HCP tasks have been well studied individually, single-stage multi-task learning of HCP tasks has not been fully exploited in the literature due to the absence of a comprehensive benchmark dataset. To address this gap, we propose the COCO-UniHuman benchmark to enable model development and comprehensive evaluation. Experimental results demonstrate the proposed method's state-of-the-art performance among multi-task HCP models and its competitive performance compared to task-specific HCP models. Moreover, our experiments underscore Human Query's adaptability to new HCP tasks, demonstrating its robust generalization capability. Code and data are available at https://github.com/lishuhuai527/COCO-UniHuman.
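The core idea described above is a single set of learned per-person queries that all HCP task heads share. The PyTorch sketch below illustrates one plausible reading of that design: a DETR-style decoder refines the queries against image features, and lightweight heads decode each refined query into a box, a mask embedding, keypoints, and attribute logits. All module names, dimensions, and head choices here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class HumanQueryModel(nn.Module):
    """Hypothetical sketch of a shared "Human Query" with multi-task heads.

    The actual HQNet architecture may differ; this only shows the pattern of
    one query set feeding several task-specific decoders.
    """

    def __init__(self, num_queries=100, dim=256, num_keypoints=17, num_attrs=2):
        super().__init__()
        # One learned embedding per candidate person ("Human Query").
        self.queries = nn.Embedding(num_queries, dim)
        decoder_layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)
        # Task-specific heads all read the same refined query.
        self.box_head = nn.Linear(dim, 4)                  # detection: (cx, cy, w, h)
        self.mask_head = nn.Linear(dim, dim)               # segmentation: mask embedding
        self.kpt_head = nn.Linear(dim, num_keypoints * 2)  # pose: (x, y) per keypoint
        self.attr_head = nn.Linear(dim, num_attrs)         # attribute logits

    def forward(self, image_feats):
        # image_feats: (B, HW, C) flattened backbone features.
        b = image_feats.size(0)
        q = self.queries.weight.unsqueeze(0).expand(b, -1, -1)
        q = self.decoder(q, image_feats)                   # (B, num_queries, C)
        return {
            "boxes": self.box_head(q).sigmoid(),
            "mask_embeds": self.mask_head(q),              # dot with pixel feats -> masks
            "keypoints": self.kpt_head(q).sigmoid(),
            "attrs": self.attr_head(q),
        }

feats = torch.randn(2, 64 * 64, 256)  # dummy flattened feature map
out = HumanQueryModel()(feats)
print(out["boxes"].shape, out["keypoints"].shape)  # (2, 100, 4) (2, 100, 34)
```

The single query set is what makes the framework single-stage: instance disentanglement happens once in the decoder, and each task head is just a cheap readout of the same representation.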
Code Repositories
https://github.com/lishuhuai527/COCO-UniHuman
Benchmarks
| Benchmark | Method | Metric | Value |
|---|---|---|---|
| Human Instance Segmentation on OCHuman | HQNet (ResNet-50) | AP | 31.1 |
| Human Instance Segmentation on OCHuman | HQNet (ViT-L) | AP | 38.8 |
| Pose Estimation on OCHuman | HQNet (ResNet-50) | Test AP | 40.0 |
| Pose Estimation on OCHuman | HQNet (ViT-L) | Test AP | 45.6 |
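The AP numbers above follow the standard COCO-style evaluation protocol. A minimal sketch of how such metrics are typically computed with pycocotools, assuming predictions exported in COCO result format and an OCHuman annotation file (both file paths are placeholders):

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Placeholder paths: substitute the OCHuman annotation and prediction files.
coco_gt = COCO("ochuman_val_annotations.json")
coco_dt = coco_gt.loadRes("hqnet_predictions.json")

# iouType: "segm" for instance segmentation AP, "keypoints" for pose AP.
evaluator = COCOeval(coco_gt, coco_dt, iouType="keypoints")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP / AR at the standard thresholds
```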