Jun Liu, Amir Shahroudy, Dong Xu, Gang Wang

Abstract
3D action recognition, the analysis of human actions based on 3D skeleton data, has recently become popular due to its succinctness, robustness, and view-invariant representation. Recent attempts on this problem suggested developing RNN-based learning methods to model the contextual dependency in the temporal domain. In this paper, we extend this idea to spatio-temporal domains to analyze the hidden sources of action-related information within the input data over both domains concurrently. Inspired by the graphical structure of the human skeleton, we further propose a more powerful tree-structure based traversal method. To handle the noise and occlusion in 3D skeleton data, we introduce a new gating mechanism within LSTM to learn the reliability of the sequential input data and accordingly adjust its effect on updating the long-term context information stored in the memory cell. Our method achieves state-of-the-art performance on 4 challenging benchmark datasets for 3D human action analysis.
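The abstract does not give the equations of the gating mechanism, so the following is only a minimal sketch of the general idea, under stated assumptions: the network's context yields a prediction of the current input, a trust score is high when the observed input agrees with that prediction, and the score controls how strongly the new input overwrites the memory cell. The function names (`trust_score`, `gated_cell_update`), the Gaussian-shaped score, and the parameter `lam` are illustrative choices, not taken from the source.

```python
import math


def trust_score(predicted, observed, lam=0.5):
    """Per-dimension trust in (0, 1].

    Assumption: trust is 1 when the observed input matches the value
    predicted from context, and decays towards 0 as they diverge.
    """
    return [math.exp(-lam * (p - x) ** 2) for p, x in zip(predicted, observed)]


def gated_cell_update(prev_cell, candidate, input_gate, forget_gate, trust):
    """Blend new information and stored memory according to trust.

    High trust lets the candidate (new input) dominate the update;
    low trust keeps the previously stored cell state largely intact.
    """
    return [
        t * i * u + (1.0 - t) * f * c
        for c, u, i, f, t in zip(prev_cell, candidate, input_gate, forget_gate, trust)
    ]


# Hypothetical values for a two-dimensional memory cell.
prev_cell = [0.8, -0.3]
candidate = [0.1, 0.9]
input_gate = [1.0, 1.0]
forget_gate = [1.0, 1.0]

# A joint reading that agrees with the context-based prediction.
reliable = trust_score(predicted=[0.2, 0.5], observed=[0.2, 0.5])
# A joint reading far from the prediction, e.g. caused by occlusion.
noisy = trust_score(predicted=[0.2, 0.5], observed=[3.0, -2.5])

cell_reliable = gated_cell_update(prev_cell, candidate, input_gate, forget_gate, reliable)
cell_noisy = gated_cell_update(prev_cell, candidate, input_gate, forget_gate, noisy)
```

In this sketch a reliable reading replaces the stored memory with the candidate, while an implausible reading leaves the memory close to its previous value, which is the behaviour the abstract attributes to the gate when skeleton data is noisy or occluded.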
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| skeleton-based-action-recognition-on-ntu-rgbd | Spatio-Temporal LSTM | Accuracy (CS): 69.2%; Accuracy (CV): 77.7% |
| skeleton-based-action-recognition-on-ntu-rgbd | ST-LSTM | Accuracy (CS): 61.70%; Accuracy (CV): 75.50% |
| skeleton-based-action-recognition-on-ntu-rgbd-1 | Spatio-Temporal LSTM | Accuracy (Cross-Setup): 57.9%; Accuracy (Cross-Subject): 55.7% |
| skeleton-based-action-recognition-on-sbu | ST-LSTM + Trust Gate | Accuracy: 93.3% |