Spatio-temporal Relation Modeling for Few-shot Action Recognition

Anirudh Thatipelli Sanath Narayan Salman Khan Rao Muhammad Anwer Fahad Shahbaz Khan Bernard Ghanem

Abstract

We propose a novel few-shot action recognition framework, STRM, which enhances class-specific feature discriminability while simultaneously learning higher-order temporal representations. The focus of our approach is a novel spatio-temporal enrichment module that aggregates spatial and temporal contexts with dedicated local patch-level and global frame-level feature enrichment sub-modules. Local patch-level enrichment captures the appearance-based characteristics of actions. On the other hand, global frame-level enrichment explicitly encodes the broad temporal context, thereby capturing the relevant object features over time. The resulting spatio-temporally enriched representations are then utilized to learn the relational matching between query and support action sub-sequences. We further introduce a query-class similarity classifier on the patch-level enriched features to enhance class-specific feature discriminability by reinforcing the feature learning at different stages in the proposed framework. Experiments are performed on four few-shot action recognition benchmarks: Kinetics, SSv2, HMDB51 and UCF101. Our extensive ablation study reveals the benefits of the proposed contributions. Furthermore, our approach sets a new state-of-the-art on all four benchmarks. On the challenging SSv2 benchmark, our approach achieves an absolute gain of $3.5\%$ in classification accuracy, as compared to the best existing method in the literature. Our code and models are available at https://github.com/Anirudh257/strm.
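The abstract mentions relational matching between query and support action sub-sequences but does not say how sub-sequences are formed or compared. The following is a minimal sketch of that general idea only, not the STRM implementation. It assumes a sub-sequence is an ordered tuple of frame indices, that a tuple is represented by concatenating its per-frame feature vectors, and that tuples are compared by cosine similarity. It omits the enrichment sub-modules and all learned layers, and every function name here is hypothetical.

```python
from itertools import combinations
import math

def subsequences(num_frames, cardinality=2):
    """All frame-index tuples of the given length, in temporal order."""
    return list(combinations(range(num_frames), cardinality))

def tuple_feature(frames, idx):
    """Concatenate the per-frame feature vectors selected by idx."""
    return [x for i in idx for x in frames[i]]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def class_score(query, support_videos, cardinality=2):
    """Average, over query sub-sequences, of the best match among
    all sub-sequences of one class's support videos (an assumed rule)."""
    q_feats = [tuple_feature(query, idx)
               for idx in subsequences(len(query), cardinality)]
    s_feats = [tuple_feature(video, idx)
               for video in support_videos
               for idx in subsequences(len(video), cardinality)]
    return sum(max(cosine(q, s) for s in s_feats) for q in q_feats) / len(q_feats)

def classify(query, support_set, cardinality=2):
    """Return the best-scoring class label and the per-class scores."""
    scores = {label: class_score(query, videos, cardinality)
              for label, videos in support_set.items()}
    return max(scores, key=scores.get), scores

# Toy 2-way 1-shot episode: 3 frames per video, 2-d features per frame.
# The two classes contain the same frames in opposite temporal order.
support_set = {
    "push": [[[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]],
    "pull": [[[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]],
}
query = [[1.0, 0.0], [0.6, 0.4], [0.0, 1.0]]

label, scores = classify(query, support_set)
print(label)  # prints "push"
```

Because each tuple keeps its frames in temporal order, the two toy classes are separable even though they contain the same frames; a representation that pooled frames without regard to order would not distinguish them.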

Code Repositories

Anirudh257/strm (official implementation, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| few-shot-action-recognition-on-hmdb51 | STRM | 1:1 Accuracy: 77.3 |
| few-shot-action-recognition-on-kinetics-100 | STRM | Accuracy: 86.7 |
| few-shot-action-recognition-on-something | STRM | 1:1 Accuracy: 68.1 |
| few-shot-action-recognition-on-ucf101 | STRM | 1:1 Accuracy: 96.8 |
