Temporal-Relational CrossTransformers for Few-Shot Action Recognition

Toby Perrett, Alessandro Masullo, Tilo Burghardt, Majid Mirmehdi, Dima Damen

Abstract

We propose a novel approach to few-shot action recognition, finding temporally-corresponding frame tuples between the query and videos in the support set. Distinct from previous few-shot works, we construct class prototypes using the CrossTransformer attention mechanism to observe relevant sub-sequences of all support videos, rather than using class averages or single best matches. Video representations are formed from ordered tuples of varying numbers of frames, which allows sub-sequences of actions at different speeds and temporal offsets to be compared. Our proposed Temporal-Relational CrossTransformers (TRX) achieve state-of-the-art results on few-shot splits of Kinetics, Something-Something V2 (SSv2), HMDB51 and UCF101. Importantly, our method outperforms prior work on SSv2 by a wide margin (12%) due to its ability to model temporal relations. A detailed ablation showcases the importance of matching to multiple support set videos and learning higher-order relational CrossTransformers.
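
The two ideas in the abstract, ordered frame tuples and attention-built, query-specific class prototypes, can be illustrated with a minimal sketch. This is a simplification under stated assumptions and not the authors' implementation: it uses raw per-frame feature vectors directly as queries, keys and values (no learned projections, normalisation or positional encodings), and all function names (`ordered_tuples`, `tuple_features`, `query_specific_prototype`, `class_distance`) are hypothetical.

```python
import itertools
import math


def ordered_tuples(num_frames, cardinality):
    """Index tuples (i1 < i2 < ...) of the given size, preserving temporal order."""
    return list(itertools.combinations(range(num_frames), cardinality))


def tuple_features(frames, cardinality):
    """Concatenate per-frame feature vectors for every ordered tuple of frames."""
    return [
        [x for i in idx for x in frames[i]]
        for idx in ordered_tuples(len(frames), cardinality)
    ]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def query_specific_prototype(query_tuple, support_tuples):
    """Attention-weighted sum over tuples pooled from ALL support videos of a class.

    Assumption: scaled dot-product attention on raw features, no learned maps.
    """
    scale = math.sqrt(len(query_tuple))
    weights = softmax([dot(query_tuple, s) / scale for s in support_tuples])
    return [
        sum(w * s[d] for w, s in zip(weights, support_tuples))
        for d in range(len(query_tuple))
    ]


def class_distance(query_frames, support_videos, cardinality=2):
    """Mean squared distance between each query tuple and its prototype."""
    q_tuples = tuple_features(query_frames, cardinality)
    s_tuples = [
        t for video in support_videos for t in tuple_features(video, cardinality)
    ]
    total = 0.0
    for q in q_tuples:
        proto = query_specific_prototype(q, s_tuples)
        total += sum((a - b) ** 2 for a, b in zip(q, proto))
    return total / len(q_tuples)


# Toy example: two classes built from the same frames in opposite temporal order.
query = [[1.0, 0.0], [0.0, 1.0]]
class_a = [[[1.0, 0.0], [0.0, 1.0]]]  # same order as the query
class_b = [[[0.0, 1.0], [1.0, 0.0]]]  # reversed order
print(class_distance(query, class_a), class_distance(query, class_b))
```

Because tuples keep frame order, the reversed-order class ends up farther from the query than the same-order class even though both contain identical frames. Running the same comparison with several `cardinality` values (pairs, triples, and so on) and combining the distances would mirror the idea of using tuples of varying numbers of frames.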

Code Repositories

tobyperrett/trx (official, PyTorch)
tobyperrett/few-shot-action-recognition (PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| few-shot-action-recognition-on-hmdb51 | TRX | 1:1 Accuracy: 75.6 |
| few-shot-action-recognition-on-kinetics-100 | TRX | Accuracy: 85.9 |
| few-shot-action-recognition-on-something | TRX | 1:1 Accuracy: 64.6 |
| few-shot-action-recognition-on-ucf101 | TRX | 1:1 Accuracy: 96.1 |
