Dennis Ludl; Thomas Gulde; Cristóbal Curio

Abstract
Recognizing human actions is a core challenge for autonomous systems, as they directly share the same space with humans and must be able to recognize and assess human actions in real time. Training the corresponding data-driven algorithms requires a significant amount of annotated training data. We demonstrate a pipeline that detects humans, estimates their pose, tracks them over time, and recognizes their actions in real time using standard monocular camera sensors. For action recognition, we encode the human pose into a new data format called Encoded Human Pose Image (EHPI), which can then be classified using standard methods from the computer vision community. With this simple procedure we achieve performance competitive with the state of the art in pose-based action recognition while ensuring real-time operation. In addition, we show a use case in the context of autonomous driving to demonstrate how such a system can be trained to recognize human actions using simulation data.
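The core idea of the EHPI, as the abstract describes it, is to pack a 2D pose sequence into an image-like tensor so that standard image classifiers can consume it. Below is a minimal sketch of such an encoding, not the authors' implementation: each row corresponds to a joint, each column to a frame, and the first two channels carry normalized x and y coordinates. The function name `encode_ehpi`, the 32-frame window, the 15-joint skeleton, and the per-sequence min-max normalization are illustrative assumptions.

```python
import numpy as np

def encode_ehpi(joint_seq: np.ndarray) -> np.ndarray:
    """Encode a 2D pose sequence as an image-like tensor.

    joint_seq: array of shape (num_frames, num_joints, 2) holding
    pixel x/y coordinates per joint per frame.

    Returns an array of shape (num_joints, num_frames, 3): channel 0
    holds normalized x, channel 1 normalized y, channel 2 is left at
    zero. This mirrors the general idea of an Encoded Human Pose
    Image; the paper's exact normalization may differ.
    """
    num_frames, num_joints, _ = joint_seq.shape
    ehpi = np.zeros((num_joints, num_frames, 3), dtype=np.float32)

    # Normalize each coordinate axis to [0, 1] over the whole sequence
    # so the encoding is invariant to where the person is in the frame.
    for axis in range(2):
        coords = joint_seq[..., axis]          # (num_frames, num_joints)
        lo, hi = coords.min(), coords.max()
        scale = (hi - lo) or 1.0               # guard against static input
        ehpi[..., axis] = ((coords - lo) / scale).T  # joints x frames

    return ehpi

# Usage: a 32-frame window of 15 joints becomes a 15x32x3 "image"
# that any standard image classifier (e.g. a small CNN) can consume.
sequence = np.random.rand(32, 15, 2) * np.array([1920.0, 1080.0])
image = encode_ehpi(sequence)
print(image.shape)  # (15, 32, 3)
```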
Benchmarks
| Benchmark | Method | Metric | Score |
|---|---|---|---|
| skeleton-based-action-recognition-on-j-hmdb | EHPI | Accuracy (pose) | 65.5 |
| skeleton-based-action-recognition-on-j-hmdb | EHPI | Accuracy (RGB+pose) | - |
| skeleton-based-action-recognition-on-jhmdb-2d | EHPI | Average accuracy over 3 splits | 65.5 |