
Multi-scale Motion-Aware Module for Video Action Recognition

Yu-Chee Tseng, Huai-Wei Peng

Abstract

Due to the lengthy computing time of optical flow, recent works have proposed using the correlation operation as an alternative approach to extracting motion features. Although the correlation operation yields significant improvement with negligible FLOPs, it introduces much more latency per FLOP than convolution operations, and its latency grows noticeably as a larger searching patch is applied. However, shrinking the searching patch in the correlation operation degrades performance, because it can no longer capture larger displacements. In this paper, we propose an effective and low-latency Multi-Scale Motion-Aware (MSMA) module. It uses smaller searching patches at different scales to efficiently extract motion features from large displacements. It can be installed into, and generalizes well on, different CNN backbones. When installed into TSM ResNet-50, the MSMA module introduces ≈ 17.6% more latency on an NVIDIA Tesla V100 GPU, yet it achieves state-of-the-art performance on Something-Something V1 & V2 and Diving-48.
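The idea of reaching large displacements with small searching patches at several scales can be illustrated with a minimal NumPy sketch. This is not the authors' MSMA implementation: the function names (`local_correlation`, `avg_pool2`, `multi_scale_correlation`), the use of 2x2 average pooling to form the coarser scale, and the channel-mean normalization are all assumptions made for illustration.

```python
import numpy as np


def local_correlation(f1, f2, max_disp):
    """Correlate each position of f1 with a small neighborhood of f2.

    f1, f2: feature maps of shape (C, H, W) from two adjacent frames.
    Returns an array of shape (P*P, H, W) with P = 2*max_disp + 1, where
    channel i*P + j holds the offset (dy, dx) = (i - max_disp, j - max_disp).
    """
    c, h, w = f1.shape
    p = 2 * max_disp + 1
    # Zero-pad f2 so that every shifted window stays inside the array.
    padded = np.pad(f2, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    out = np.empty((p * p, h, w), dtype=np.float64)
    for i in range(p):
        for j in range(p):
            shifted = padded[:, i:i + h, j:j + w]  # f2 sampled at (y+dy, x+dx)
            out[i * p + j] = (f1 * shifted).sum(axis=0) / c
    return out


def avg_pool2(f):
    """2x2 average pooling on a (C, H, W) array with even H and W."""
    c, h, w = f.shape
    return f.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def multi_scale_correlation(f1, f2, max_disp=1, num_scales=2):
    """Apply the same small searching patch at progressively coarser scales."""
    maps = []
    for _ in range(num_scales):
        maps.append(local_correlation(f1, f2, max_disp))
        f1, f2 = avg_pool2(f1), avg_pool2(f2)
    return maps
```

In this sketch, a 3x3 searching patch (`max_disp=1`) at half resolution spans offsets of up to 2 pixels at the original resolution, a range that would otherwise need a 5x5 patch at full resolution. Here the two-scale version evaluates 9 offsets per position at each scale, and the coarser scale has a quarter of the positions.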

Benchmarks

Benchmark | Methodology | Metrics
action-recognition-in-videos-on-something | MSMA (8+16 frames) | Top-1 Accuracy: 68.2
action-recognition-in-videos-on-something-1 | MSMA (8+16 frames) | Top-1 Accuracy: 57.9
