Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition

Jun Liu, Amir Shahroudy, Dong Xu, Gang Wang

Abstract

3D action recognition, the analysis of human actions based on 3D skeleton data, has recently become popular due to its succinctness, robustness, and view-invariant representation. Recent attempts on this problem suggested developing RNN-based learning methods to model the contextual dependency in the temporal domain. In this paper, we extend this idea to spatio-temporal domains to analyze the hidden sources of action-related information within the input data over both domains concurrently. Inspired by the graphical structure of the human skeleton, we further propose a more powerful tree-structure based traversal method. To handle the noise and occlusion in 3D skeleton data, we introduce a new gating mechanism within LSTM to learn the reliability of the sequential input data and accordingly adjust its effect on updating the long-term context information stored in the memory cell. Our method achieves state-of-the-art performance on 4 challenging benchmark datasets for 3D human action analysis.
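The reliability-gating idea in the abstract can be illustrated with a small sketch. The abstract does not give the gate's form, so everything here is an assumption made for illustration: the function names `trust_gate` and `gated_cell_update`, the Gaussian-shaped score, the parameter `lam`, and the reduction to a single memory path (the method itself draws context from both the spatial and temporal domains). It is not the authors' implementation.

```python
import math

def trust_gate(predicted, observed, lam=0.5):
    """Hypothetical reliability score in (0, 1] per feature.

    Close to 1 when the observed input agrees with what the stored
    context predicted, and decaying toward 0 as they diverge.
    The Gaussian shape and `lam` are illustrative assumptions.
    """
    return [math.exp(-lam * (p - o) ** 2) for p, o in zip(predicted, observed)]

def gated_cell_update(c_prev, candidate, input_gate, forget_gate, trust):
    """Blend new input and stored memory according to the trust score.

    High trust lets the candidate value update the cell; low trust
    keeps the cell close to its previous (forget-gated) content.
    """
    return [
        t * i * u + (1.0 - t) * f * c
        for c, u, i, f, t in zip(c_prev, candidate, input_gate, forget_gate, trust)
    ]

# Two features: the first matches the prediction, the second looks like noise.
predicted = [0.2, 0.2]
observed = [0.2, 9.0]
trust = trust_gate(predicted, observed)

c_new = gated_cell_update(
    c_prev=[1.0, 1.0],
    candidate=[0.5, 0.5],
    input_gate=[1.0, 1.0],
    forget_gate=[1.0, 1.0],
    trust=trust,
)
print(trust, c_new)
```

In this sketch the first feature is fully trusted, so its cell value is replaced by the candidate, while the second feature is treated as unreliable and its cell value stays essentially unchanged.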

Benchmarks

Benchmark	Methodology	Metrics
skeleton-based-action-recognition-on-ntu-rgbd	Spatio-Temporal LSTM	Accuracy (CS): 69.2%; Accuracy (CV): 77.7%
skeleton-based-action-recognition-on-ntu-rgbd	ST-LSTM	Accuracy (CS): 61.70%; Accuracy (CV): 75.50%
skeleton-based-action-recognition-on-ntu-rgbd-1	Spatio-Temporal LSTM	Accuracy (Cross-Setup): 57.9%; Accuracy (Cross-Subject): 55.7%
skeleton-based-action-recognition-on-sbu	ST-LSTM + Trust Gate	Accuracy: 93.3%
