KTPFormer: Kinematics and Trajectory Prior Knowledge-Enhanced Transformer for 3D Human Pose Estimation

Peng Jihua; Zhou Yanghong; Mok P. Y.

Abstract

This paper presents a novel Kinematics and Trajectory Prior Knowledge-Enhanced Transformer (KTPFormer), which overcomes a weakness in existing transformer-based methods for 3D human pose estimation: the Q, K, V vectors in their self-attention mechanisms are all derived by simple linear mapping. We propose two prior attention modules, namely Kinematics Prior Attention (KPA) and Trajectory Prior Attention (TPA), to take advantage of the known anatomical structure of the human body and motion trajectory information, and thereby facilitate effective learning of global dependencies and features in the multi-head self-attention. KPA models kinematic relationships in the human body by constructing a topology of kinematics, while TPA builds a trajectory topology to learn the information of joint motion trajectory across frames. By yielding Q, K, V vectors with prior knowledge, the two modules enable KTPFormer to model both spatial and temporal correlations simultaneously. Extensive experiments on three benchmarks (Human3.6M, MPI-INF-3DHP and HumanEva) show that KTPFormer achieves superior performance in comparison to state-of-the-art methods. More importantly, our KPA and TPA modules have lightweight plug-and-play designs and can be integrated into various transformer-based networks (e.g., diffusion-based ones) to improve performance with only a very small increase in computational overhead. The code is available at: https://github.com/JihuaPeng/KTPFormer.
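The idea of injecting a topology before the Q, K, V mapping can be illustrated with a small sketch. This is not the authors' implementation: the toy skeleton `TOY_BONES`, the fixed row-normalised adjacency, and the helper names (`kinematic_topology`, `trajectory_topology`, `inject_prior`) are all assumptions made for illustration, and the abstract does not specify how the topologies are parameterised.

```python
# Illustrative sketch only; names and the toy skeleton are hypothetical.

# Toy 5-joint skeleton (not the Human3.6M layout):
# 0 = pelvis, 1 = spine, 2 = head, 3 = left hip, 4 = right hip.
TOY_BONES = [(0, 1), (1, 2), (0, 3), (0, 4)]


def row_normalise(adj):
    """Scale every row so that its entries sum to 1."""
    return [[v / sum(row) for v in row] for row in adj]


def kinematic_topology(num_joints, bones):
    """Adjacency over joints: self-loops plus one link per bone."""
    adj = [[0.0] * num_joints for _ in range(num_joints)]
    for j in range(num_joints):
        adj[j][j] = 1.0
    for a, b in bones:
        adj[a][b] = 1.0
        adj[b][a] = 1.0
    return row_normalise(adj)


def trajectory_topology(num_frames):
    """Adjacency over frames: each frame links to itself and its neighbours."""
    adj = [[0.0] * num_frames for _ in range(num_frames)]
    for t in range(num_frames):
        for u in (t - 1, t, t + 1):
            if 0 <= u < num_frames:
                adj[t][u] = 1.0
    return row_normalise(adj)


def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def inject_prior(topology, tokens):
    """Mix token features along the topology; a linear Q/K/V mapping
    would then be applied to the result instead of the raw tokens."""
    return matmul(topology, tokens)


# One 2-D feature per joint for a single frame.
joint_tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0], [0.0, 4.0]]
spatial_prior_tokens = inject_prior(
    kinematic_topology(5, TOY_BONES), joint_tokens)
```

In this sketch the spatial case mixes features of joints that share a bone, and the temporal case would apply `trajectory_topology` to the per-frame tokens of one joint in the same way. A learnable topology, as a trainable model would likely use, is outside the scope of this example.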

Code Repositories

JihuaPeng/KTPFormer (official, PyTorch)

Benchmarks

Benchmark: 3d-human-pose-estimation-on-human36m
Methodology: KTPFormer (T=243)
Metrics:
  Average MPJPE (mm): 33.0
  Multi-View or Monocular: Monocular
  PA-MPJPE: 26.2
  Using 2D ground-truth joints: No

Benchmark: 3d-human-pose-estimation-on-human36m
Methodology: KTPFormer
Metrics:
  Average MPJPE (mm): 18.1
  Multi-View or Monocular: Monocular
  Using 2D ground-truth joints: Yes

Benchmark: 3d-human-pose-estimation-on-mpi-inf-3dhp
Methodology: KTPFormer
Metrics:
  AUC: 85.9
  MPJPE: 16.7
  PCK: 98.9

Benchmark: monocular-3d-human-pose-estimation-on-human3
Methodology: KTPFormer
Metrics:
  2D detector: CPN
  Average MPJPE (mm): 40.1
  Frames Needed: 243
  Need Ground Truth 2D Pose: No
  Use Video Sequence: Yes
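
For readers unfamiliar with the metrics listed above, the following minimal sketch shows how MPJPE and PCK are commonly computed for a single pose. The function names and the sample coordinates are made up for illustration, and the PCK threshold is left as a parameter because this page does not state which value the benchmark uses. PA-MPJPE additionally aligns the prediction to the ground truth with a Procrustes (similarity) transform before measuring the error; that step is omitted here.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error: the average Euclidean distance
    between predicted and ground-truth joint positions."""
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists)


def pck(pred, gt, threshold):
    """Percentage of joints whose error is at most `threshold`
    (same unit as the coordinates)."""
    hits = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return 100.0 * hits / len(pred)


# Made-up two-joint pose in millimetres.
gt_pose = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
pred_pose = [(3.0, 4.0, 0.0), (100.0, 0.0, 12.0)]

error_mm = mpjpe(pred_pose, gt_pose)          # joint errors are 5 mm and 12 mm
correct_pct = pck(pred_pose, gt_pose, 10.0)   # only the first joint is within 10 mm
```

Benchmark numbers are obtained by averaging such per-pose values over all test frames.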
