Amir Shahroudy; Jun Liu; Tian-Tsong Ng; Gang Wang

Abstract
Recent approaches in depth-based human activity analysis have achieved outstanding performance and proved the effectiveness of 3D representation for classification of action classes. Currently available depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of training samples, distinct class labels, camera views, and variety of subjects. In this paper, we introduce a large-scale dataset for RGB+D human action recognition with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects. Our dataset contains 60 different action classes including daily, mutual, and health-related actions. In addition, we propose a new recurrent neural network structure to model the long-term temporal correlation of the features for each body part, and utilize them for better action classification. Experimental results show the advantages of applying deep learning methods over state-of-the-art hand-crafted features on the suggested cross-subject and cross-view evaluation criteria for our dataset. The introduction of this large-scale dataset will enable the community to apply, develop, and adapt various data-hungry learning techniques for the task of depth-based and RGB+D-based human activity analysis.
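The cross-subject and cross-view criteria mentioned in the abstract amount to partitioning samples so that training and test sets share no performers (cross-subject) or no camera views (cross-view). The following is a minimal sketch of that idea, not the authors' protocol: the `Sample` record, its field names, and the ID sets passed in are all hypothetical, and the actual train/test IDs used for the dataset are not given in this summary.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set, Tuple


@dataclass(frozen=True)
class Sample:
    """Hypothetical metadata record for one video sample."""
    name: str
    subject: int  # ID of the performer (assumed field)
    camera: int   # ID of the camera view (assumed field)
    label: int    # action class index (assumed field)


def _split_by(samples: Sequence[Sample], key: str,
              train_ids: Set[int]) -> Tuple[List[Sample], List[Sample]]:
    """Put samples whose `key` attribute is in `train_ids` into the training set."""
    train = [s for s in samples if getattr(s, key) in train_ids]
    test = [s for s in samples if getattr(s, key) not in train_ids]
    return train, test


def cross_subject_split(samples: Sequence[Sample],
                        train_subjects: Set[int]) -> Tuple[List[Sample], List[Sample]]:
    """Train and test sets contain disjoint sets of performers."""
    return _split_by(samples, "subject", train_subjects)


def cross_view_split(samples: Sequence[Sample],
                     train_cameras: Set[int]) -> Tuple[List[Sample], List[Sample]]:
    """Train and test sets contain disjoint sets of camera views."""
    return _split_by(samples, "camera", train_cameras)
```

Under these assumptions, calling `cross_subject_split(samples, {1, 2})` would train on every sample performed by subjects 1 and 2 and test on the rest, so a model is always evaluated on people (or viewpoints) it has never seen.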
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| skeleton-based-action-recognition-on-cad-120 | P-LSTM (5-shot) | Accuracy: 68.1% |
| skeleton-based-action-recognition-on-ntu-rgbd | Part-aware LSTM | Accuracy (CS): 62.93%; Accuracy (CV): 70.27% |
| skeleton-based-action-recognition-on-ntu-rgbd | Deep LSTM | Accuracy (CS): 60.7%; Accuracy (CV): 67.3% |
| skeleton-based-action-recognition-on-ntu-rgbd-1 | Part-Aware LSTM | Accuracy (Cross-Setup): 26.3%; Accuracy (Cross-Subject): 25.5% |
| skeleton-based-action-recognition-on-varying | P-LSTM | Accuracy (AV I): 33%; Accuracy (AV II): 50%; Accuracy (CS): 60%; Accuracy (CV I): 13%; Accuracy (CV II): 33% |
| skeleton-based-action-recognition-on-varying | LSTM | Accuracy (AV I): 31%; Accuracy (AV II): 68%; Accuracy (CS): 56%; Accuracy (CV I): 16%; Accuracy (CV II): 31% |