PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation

Zhao Qitao ; Zheng Ce ; Liu Mengyuan ; Wang Pichao ; Chen Chen

Abstract

Recently, transformer-based methods have gained significant success in sequential 2D-to-3D lifting human pose estimation. As a pioneering work, PoseFormer captures spatial relations of human joints in each video frame and human dynamics across frames with cascaded transformer layers and has achieved impressive performance. However, in real scenarios, the performance of PoseFormer and its follow-ups is limited by two factors: (a) The length of the input joint sequence; (b) The quality of 2D joint detection. Existing methods typically apply self-attention to all frames of the input sequence, causing a huge computational burden when the frame number is increased to obtain advanced estimation accuracy, and they are not robust to noise naturally brought by the limited capability of 2D joint detectors. In this paper, we propose PoseFormerV2, which exploits a compact representation of lengthy skeleton sequences in the frequency domain to efficiently scale up the receptive field and boost robustness to noisy 2D joint detection. With minimum modifications to PoseFormer, the proposed method effectively fuses features both in the time domain and frequency domain, enjoying a better speed-accuracy trade-off than its precursor. Extensive experiments on two benchmark datasets (i.e., Human3.6M and MPI-INF-3DHP) demonstrate that the proposed approach significantly outperforms the original PoseFormer and other transformer-based variants. Code is released at https://github.com/QitaoZhao/PoseFormerV2.
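The idea of a compact frequency-domain representation of a long skeleton sequence can be illustrated with a minimal sketch. It assumes a discrete cosine transform (DCT) along the time axis; the abstract does not name the transform, and the function names, the sequence length of 81 frames, the 17 joints, and the choice of keeping 9 coefficients are all hypothetical values chosen for illustration, not the authors' implementation.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix; row k is the k-th frequency."""
    t = np.arange(n)
    k = t[:, None]
    basis = np.cos(np.pi * (2 * t[None, :] + 1) * k / (2 * n))
    basis[0] *= np.sqrt(1.0 / n)   # scaling for the constant (k = 0) row
    basis[1:] *= np.sqrt(2.0 / n)  # scaling for all other rows
    return basis


def low_frequency_coeffs(seq, n_keep):
    """Keep the n_keep lowest DCT coefficients along time.

    seq: array of shape (T, J, 2), a sequence of T frames with J 2D joints.
    Returns an array of shape (n_keep, J, 2).
    """
    T = seq.shape[0]
    coeffs = np.tensordot(dct_matrix(T), seq, axes=(1, 0))  # (T, J, 2)
    return coeffs[:n_keep]


def reconstruct(coeffs, T):
    """Invert the truncated transform back to a length-T sequence (low-pass result)."""
    n_keep = coeffs.shape[0]
    basis = dct_matrix(T)[:n_keep]  # (n_keep, T)
    return np.tensordot(basis.T, coeffs, axes=(1, 0))  # (T, J, 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy_2d = rng.normal(size=(81, 17, 2))       # stand-in for detected 2D joints
    compact = low_frequency_coeffs(noisy_2d, 9)   # 81 frames -> 9 coefficients per joint coordinate
    smoothed = reconstruct(compact, 81)
    print(compact.shape, smoothed.shape)
```

Under these assumptions, a downstream model would attend over 9 coefficient tokens rather than 81 frame tokens, and discarding high-frequency coefficients acts as a low-pass filter on jittery 2D detections.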

Code Repositories

zczcwh/DL-HPE (mentioned in GitHub)
qitaozhao/poseformerv2 (official; PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
3d-human-pose-estimation-on-human36m | PoseFormerV2 (f=27, T=243) | Average MPJPE (mm): 45.2
3d-human-pose-estimation-on-mpi-inf-3dhp | PoseFormerV2 (T=81) | AUC: 78.8; MPJPE: 27.8; PCK: 97.9
classification-on-full-body-parkinsons | PoseFormerV2 | F1-score (weighted): 0.59
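For readers unfamiliar with the MPJPE metric reported above, the following is a minimal sketch of the usual definition: the mean Euclidean distance between predicted and ground-truth 3D joints. The function name and array layout are hypothetical, and benchmark-specific protocol details (such as root-joint alignment) are not covered here.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error.

    pred, gt: arrays of shape (N, J, 3) holding N poses with J 3D joints,
    in the same units (e.g. millimetres). Returns a single float.
    """
    per_joint_error = np.linalg.norm(pred - gt, axis=-1)  # (N, J) distances
    return float(per_joint_error.mean())


if __name__ == "__main__":
    gt = np.zeros((4, 17, 3))
    pred = gt + np.array([3.0, 4.0, 0.0])  # every joint is offset by a 3-4-5 triangle
    print(mpjpe(pred, gt))  # 5.0
```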
