Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

Liu Ziyu; Zhang Hongwen; Chen Zhenghao; Wang Zhiyong; Ouyang Wanli

Abstract

Spatial-temporal graphs have been widely used by skeleton-based action recognition algorithms to model human action dynamics. To capture robust movement patterns from these graphs, long-range and multi-scale context aggregation and spatial-temporal dependency modeling are critical aspects of a powerful feature extractor. However, existing methods have limitations in achieving (1) unbiased long-range joint relationship modeling under multi-scale operators and (2) unobstructed cross-spacetime information flow for capturing complex spatial-temporal dependencies. In this work, we present (1) a simple method to disentangle multi-scale graph convolutions and (2) a unified spatial-temporal graph convolutional operator named G3D. The proposed multi-scale aggregation scheme disentangles the importance of nodes in different neighborhoods for effective long-range modeling. The proposed G3D module leverages dense cross-spacetime edges as skip connections for direct information propagation across the spatial-temporal graph. By coupling these proposals, we develop a powerful feature extractor named MS-G3D based on which our model outperforms previous state-of-the-art methods on three large-scale datasets: NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400.
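
One way to read the disentangling idea in the abstract is that each scale k should connect a joint only to the joints exactly k hops away, instead of to everything reachable within k hops, so that nearby joints are not counted again at every larger scale. The pure-Python sketch below illustrates that reading on a hypothetical 5-joint chain. The function names and the toy graph are assumptions made for illustration; this is not the MS-G3D implementation.

```python
from collections import deque


def hop_distances(num_nodes, edges):
    """All-pairs shortest-path hop counts on an undirected graph, via BFS."""
    neighbors = [[] for _ in range(num_nodes)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    dist = [[None] * num_nodes for _ in range(num_nodes)]
    for src in range(num_nodes):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[src][v] is None:  # not visited yet from this source
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def k_hop_adjacency(dist, k, self_loops=True):
    """Mask that is 1 only where the hop distance is exactly k (plus optional self-loops)."""
    n = len(dist)
    return [
        [1 if dist[i][j] == k or (self_loops and i == j) else 0 for j in range(n)]
        for i in range(n)
    ]


def within_k_hops(dist, k):
    """For contrast: 1 wherever a joint is reachable in at most k hops, so scales overlap."""
    n = len(dist)
    return [
        [1 if dist[i][j] is not None and dist[i][j] <= k else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 5-joint chain 0-1-2-3-4 standing in for a skeleton graph.
chain_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
dist = hop_distances(5, chain_edges)
scale_1 = k_hop_adjacency(dist, 1)
scale_2 = k_hop_adjacency(dist, 2)
within_2 = within_k_hops(dist, 2)
```

In this sketch, joint 1 appears in the scale-1 row of joint 0 but not in its scale-2 row, whereas the "within 2 hops" mask includes joint 1 again at the larger scale.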

Code Repositories

kenziyuliu/ms-g3d (official; PyTorch; mentioned in GitHub)
kennymckormick/pyskl (PyTorch; mentioned in GitHub)
metrics-lab/st-fmri (PyTorch; mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| 3d-action-recognition-on-assembly101 | MS-G3D | Actions Top-1: 28.7; Object Top-1: 36.3; Verbs Top-1: 65.7 |
| action-recognition-on-h2o-2-hands-and-objects | MS-G3D | Actions Top-1: 50.83; Hand Pose: 3D; Object Label: No; Object Pose: No; RGB: No |
| skeleton-based-action-recognition-on-kinetics | MS-G3D | Accuracy: 38.0 |
| skeleton-based-action-recognition-on-ntu-rgbd | MS-G3D Net | Accuracy (CS): 91.5; Accuracy (CV): 96.2 |
| skeleton-based-action-recognition-on-ntu-rgbd-1 | MS-G3D Net | Accuracy (Cross-Setup): 88.4%; Accuracy (Cross-Subject): 86.9% |
