Chained Multi-stream Networks Exploiting Pose, Motion, and Appearance for Action Classification and Detection

Mohammadreza Zolfaghari; Gabriel L. Oliveira; Nima Sedaghat; Thomas Brox

Abstract

General human action recognition requires understanding of various visual cues. In this paper, we propose a network architecture that computes and integrates the most important visual cues for action recognition: pose, motion, and the raw images. For the integration, we introduce a Markov chain model which adds cues successively. The resulting approach is efficient and applicable to action classification as well as to spatial and temporal action localization. The two contributions clearly improve the performance over respective baselines. The overall approach achieves state-of-the-art action classification performance on HMDB51, J-HMDB and NTU RGB+D datasets. Moreover, it yields state-of-the-art spatio-temporal action localization results on UCF101 and J-HMDB.
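The idea of adding cues successively can be illustrated with a minimal sketch. This is not the authors' implementation: the cue order (pose, then motion, then appearance), the feature sizes, the random weights, and every name here (`ChainStage`, `chained_forward`, `init_layer`) are assumptions made for illustration. Each stage fuses one new cue with the hidden state passed on by the previous stage and emits its own class prediction.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Random weights and zero bias; stands in for trained parameters."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class ChainStage:
    """One link of the chain: fuses a new cue with the previous hidden state."""

    def __init__(self, rng, cue_dim, prev_dim, hidden_dim, num_classes):
        self.hidden = init_layer(rng, hidden_dim, cue_dim + prev_dim)
        self.classifier = init_layer(rng, num_classes, hidden_dim)

    def __call__(self, cue_features, prev_hidden):
        x = list(cue_features) + list(prev_hidden)           # concatenate cue and history
        h = [max(0.0, v) for v in linear(x, *self.hidden)]   # ReLU hidden state
        probs = softmax(linear(h, *self.classifier))         # this stage's prediction
        return h, probs


def chained_forward(stages, cues):
    """Run the stages in order; each one depends only on its predecessor."""
    hidden, per_stage_probs = [], []
    for stage, cue in zip(stages, cues):
        hidden, probs = stage(cue, hidden)
        per_stage_probs.append(probs)
    return per_stage_probs


rng = random.Random(0)
NUM_CLASSES, HIDDEN = 5, 8
cue_dims = [6, 6, 6]  # hypothetical feature sizes for pose, motion, appearance

stages, prev_dim = [], 0
for dim in cue_dims:
    stages.append(ChainStage(rng, dim, prev_dim, HIDDEN, NUM_CLASSES))
    prev_dim = HIDDEN  # later stages also receive the previous hidden state

cues = [[rng.random() for _ in range(dim)] for dim in cue_dims]
outputs = chained_forward(stages, cues)
for name, probs in zip(["pose", "motion", "appearance"], outputs):
    print(name, [round(p, 3) for p in probs])
```

Because each stage sees only the new cue and the state handed over by the stage before it, the chain has the Markov property the abstract refers to, and the last stage's prediction reflects all three cues.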

Benchmarks

Benchmark | Methodology | Metrics
skeleton-based-action-recognition-on-j-hmdb | Chained (RGB+Flow+Pose) | Accuracy (RGB+pose): 76.1; Accuracy (pose): 56.8
skeleton-based-action-recognition-on-jhmdb-2d | Chained | Average accuracy of 3 splits: 56.8
