5 months ago

One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer

Lin Jing; Zeng Ailing; Wang Haoqian; Zhang Lei; Li Yu

Abstract

Whole-body mesh recovery aims to estimate the 3D human body, face, and hands parameters from a single image. It is challenging to perform this task with a single network due to resolution issues, i.e., the face and hands are usually located in extremely small regions. Existing works usually detect hands and faces, enlarge their resolution to feed in a specific network to predict the parameter, and finally fuse the results. While this copy-paste pipeline can capture the fine-grained details of the face and hands, the connections between different parts cannot be easily recovered in late fusion, leading to implausible 3D rotation and unnatural pose. In this work, we propose a one-stage pipeline for expressive whole-body mesh recovery, named OSX, without separate networks for each part. Specifically, we design a Component Aware Transformer (CAT) composed of a global body encoder and a local face/hand decoder. The encoder predicts the body parameters and provides a high-quality feature map for the decoder, which performs a feature-level upsample-crop scheme to extract high-resolution part-specific features and adopt keypoint-guided deformable attention to estimate hand and face precisely. The whole pipeline is simple yet effective without any manual post-processing and naturally avoids implausible prediction. Comprehensive experiments demonstrate the effectiveness of OSX. Lastly, we build a large-scale Upper-Body dataset (UBody) with high-quality 2D and 3D whole-body annotations. It contains persons with partially visible bodies in diverse real-life scenarios to bridge the gap between the basic task and downstream applications.
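The feature-level upsample-crop idea mentioned in the abstract can be illustrated with a toy sketch. This is not the OSX implementation: it assumes a single-channel feature map stored as a list of rows, nearest-neighbour upsampling by an integer factor, and an axis-aligned part box in feature-grid coordinates. The names `upsample_nearest`, `crop`, `upsample_crop`, and `hand_box` are hypothetical and chosen only for illustration; a real model would more likely use learned upsampling and an RoI-style crop on multi-channel tensors.

```python
def upsample_nearest(feature_map, scale):
    """Nearest-neighbour upsampling of a 2D grid (list of rows) by an integer factor."""
    upsampled = []
    for row in feature_map:
        # Repeat every value `scale` times horizontally.
        wide_row = [value for value in row for _ in range(scale)]
        # Repeat the widened row `scale` times vertically.
        for _ in range(scale):
            upsampled.append(list(wide_row))
    return upsampled


def crop(feature_map, box):
    """Crop a half-open box (x0, y0, x1, y1) given in grid coordinates."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in feature_map[y0:y1]]


def upsample_crop(feature_map, box, scale):
    """Upsample the whole map, then crop the part box scaled to the new resolution."""
    x0, y0, x1, y1 = box
    scaled_box = (x0 * scale, y0 * scale, x1 * scale, y1 * scale)
    return crop(upsample_nearest(feature_map, scale), scaled_box)


# Toy 4x4 "body" feature map; the hypothetical hand box covers a 2x1 region.
features = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
hand_box = (2, 1, 4, 2)  # (x0, y0, x1, y1), half-open
hand_features = upsample_crop(features, hand_box, scale=2)
```

The point of the sketch is only that the crop happens on features rather than on image pixels, so the small hand region is represented at a higher resolution (here 4x2 instead of 2x1) without running a separate network on an enlarged image patch.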

Code Repositories

IDEA-Research/OSX
Official
pytorch
Mentioned in GitHub

Benchmarks

Benchmark | Methodology | Metrics
3d-human-pose-estimation-on-3dpw | OSX | MPJPE: 74.7; PA-MPJPE: 45.1
3d-human-pose-estimation-on-ubody | OSX | PA-PVE-All: 42.2; PA-PVE-Face: 2.0; PA-PVE-Hands: 8.6; PVE-All: 81.9; PVE-Face: 21.2; PVE-Hands: 41.5
3d-human-reconstruction-on-ehf | OSX | MPVPE: 70.8; PA V2V (mm), face: 6; PA V2V (mm), whole body: 48.7
3d-multi-person-mesh-recovery-on-agora | OSX | B-NMVE: 85.3; F-MVE: 36.2; FB-MVE: 122.8; FB-NMVE: 130.6; LH/RH-MVE: 45.7
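
For readers unfamiliar with the metrics, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth 3D joints, conventionally reported in millimetres. The sketch below is a minimal illustration, not the evaluation code behind the numbers listed here: the function name `mpjpe`, the optional `root_index` alignment, and the toy joints are assumptions. The PA- variants additionally apply a Procrustes (similarity) alignment before measuring the error, which is not shown.

```python
import math


def mpjpe(predicted, target, root_index=None):
    """Mean Euclidean distance between matching 3D joints.

    If root_index is given, both joint sets are first translated so that
    the chosen root joint sits at the origin (a common convention; the
    exact protocol depends on the benchmark).
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("joint lists must be non-empty and of equal length")

    if root_index is not None:
        pred_root = predicted[root_index]
        target_root = target[root_index]
        predicted = [tuple(c - r for c, r in zip(joint, pred_root)) for joint in predicted]
        target = [tuple(c - r for c, r in zip(joint, target_root)) for joint in target]

    distances = [math.dist(p, t) for p, t in zip(predicted, target)]
    return sum(distances) / len(distances)


# Toy example with two joints (coordinates in millimetres).
ground_truth = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
prediction = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
error = mpjpe(prediction, ground_truth)

# The same prediction shifted by a global offset; root alignment removes the shift.
shifted = [(10.0, 10.0, 10.0), (13.0, 14.0, 10.0)]
aligned_error = mpjpe(shifted, ground_truth, root_index=0)
```

Per-vertex metrics such as PVE or MPVPE follow the same pattern, with mesh vertices in place of joints.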
