MaskHand: Generative Masked Modeling for Robust Hand Mesh Reconstruction in the Wild
Saleem Muhammad Usama ; Pinyoanuntapong Ekkasit ; Patel Mayur Jagdishbhai ; Xue Hongfei ; Helmy Ahmed ; Das Srijan ; Wang Pu

Abstract
Reconstructing a 3D hand mesh from a single RGB image is challenging due to complex articulations, self-occlusions, and depth ambiguities. Traditional discriminative methods, which learn a deterministic mapping from a 2D image to a single 3D mesh, often struggle with the inherent ambiguities in 2D-to-3D mapping. To address this challenge, we propose MaskHand, a novel generative masked model for hand mesh recovery that synthesizes plausible 3D hand meshes by learning and sampling from the probabilistic distribution of the ambiguous 2D-to-3D mapping process. MaskHand consists of two key components: (1) a VQ-MANO, which encodes 3D hand articulations as discrete pose tokens in a latent space, and (2) a Context-Guided Masked Transformer that randomly masks out pose tokens and learns their joint distribution, conditioned on the corrupted token sequence, image context, and 2D pose cues. This learned distribution facilitates confidence-guided sampling during inference, producing mesh reconstructions with low uncertainty and high precision. Extensive evaluations on benchmark and real-world datasets demonstrate that MaskHand achieves state-of-the-art accuracy, robustness, and realism in 3D hand mesh reconstruction. Project website: https://m-usamasaleem.github.io/publication/MaskHand/MaskHand.html.
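
The abstract does not give the details of confidence-guided sampling, so the following is only a minimal sketch of the general idea, written in the style of iterative masked decoding: start from a fully masked token sequence, predict every masked position, keep the most confident predictions, and re-mask the least confident ones on a shrinking schedule. Everything named here is an assumption for illustration: the `MASK` sentinel, the `toy_predictor` stand-in (a real model would condition on image context and 2D pose cues), the cosine schedule, and the greedy argmax choice.

```python
import numpy as np

MASK = -1  # hypothetical sentinel id for a masked pose token


def toy_predictor(tokens, codebook_size, rng):
    """Stand-in for the masked transformer (assumption, not the paper's model).

    Returns one probability distribution over the codebook per position.
    """
    logits = rng.normal(size=(tokens.shape[0], codebook_size))
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


def confidence_guided_decode(num_tokens, codebook_size, steps, predictor, seed=0):
    """Iteratively fill a fully masked sequence, keeping confident tokens."""
    rng = np.random.default_rng(seed)
    tokens = np.full(num_tokens, MASK, dtype=np.int64)
    for step in range(steps):
        masked = tokens == MASK
        if not masked.any():
            break
        probs = predictor(tokens, codebook_size, rng)
        candidates = probs.argmax(axis=1)  # greedy choice per position
        confidence = probs.max(axis=1)
        confidence[~masked] = np.inf  # committed tokens are never re-masked
        tokens = np.where(masked, candidates, tokens)
        # Cosine schedule: how many tokens stay masked after this step.
        ratio = np.cos(0.5 * np.pi * (step + 1) / steps)
        num_remask = int(np.floor(ratio * num_tokens))
        # Always commit at least one new token so decoding makes progress.
        num_remask = min(num_remask, int(masked.sum()) - 1)
        if num_remask > 0:
            lowest = np.argsort(confidence)[:num_remask]
            tokens[lowest] = MASK
    return tokens


if __name__ == "__main__":
    print(confidence_guided_decode(16, 32, 5, toy_predictor))
```

In a full pipeline the decoded token ids would be mapped back to hand pose parameters by the decoder of the tokenizer (VQ-MANO in the paper); that step is omitted here because the abstract does not describe its interface.
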
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-hand-pose-estimation-on-dexycb | MaskHand | Average MPJPE (mm): 11.7; MPVPE: 11.2; PA-MPVPE: 4.9; Procrustes-Aligned MPJPE: 5.0 |
| 3d-hand-pose-estimation-on-freihand | MaskHand | PA-F@15mm: 0.991; PA-F@5mm: 0.801; PA-MPJPE: 5.5; PA-MPVPE: 5.4 |
| 3d-hand-pose-estimation-on-hint-hand | MaskHand | PCK@0.05 (Ego4D) All: 46.4; PCK@0.05 (Ego4D) Occ: 29.4; PCK@0.05 (Ego4D) Visible: 59.3; PCK@0.05 (New Days) All: 48.7; PCK@0.05 (New Days) Occ: 29.4; PCK@0.05 (New Days) Visible: 61.0; PCK@0.05 (VISOR) All: 46.1; PCK@0.05 (VISOR) Occ: 31.4; PCK@0.05 (VISOR) Visible: 62.1 |
| 3d-hand-pose-estimation-on-ho-3d-v3 | MaskHand | AUC_J: 0.860; AUC_V: 0.860; F@15mm: 0.984; F@5mm: 0.663; PA-MPJPE: 7.0; PA-MPVPE: 7.0 |
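
For readers unfamiliar with the metrics in the table, the sketch below shows how MPJPE and its Procrustes-aligned variant (PA-MPJPE) are commonly computed. It is a generic NumPy illustration, not the evaluation code behind the numbers above; the function names and the 21-joint example are assumptions. MPVPE and PA-MPVPE follow the same formulas applied to mesh vertices instead of joints.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance per joint."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def procrustes_align(pred, gt):
    """Align pred to gt with the best similarity transform.

    Solves for scale, rotation, and translation via SVD of the
    cross-covariance matrix. Inputs are (N, 3) arrays.
    """
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    corr = np.array([1.0, 1.0, d])
    rot = u @ np.diag(corr) @ vt
    scale = (s * corr).sum() / (p ** 2).sum()
    return scale * p @ rot + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after removing global scale, rotation, and translation."""
    return mpjpe(procrustes_align(pred, gt), gt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(21, 3))  # 21 hand joints, illustrative only
    pred = gt + rng.normal(scale=0.01, size=gt.shape)
    print(mpjpe(pred, gt), pa_mpjpe(pred, gt))
```

Because the alignment removes global pose and scale errors, PA-MPJPE isolates articulation error, which is why the PA values in the table are lower than the unaligned ones for the same benchmark.
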