4 months ago

AM Flow: Adapters for Temporal Processing in Action Recognition

Agrawal Tanay ; Ali Abid ; Dantcheva Antitza ; Bremond Francois

Abstract

Deep learning models, in particular image models, have recently gained generalisability and robustness. In this work, we propose to exploit such advances in the realm of video classification. Video foundation models suffer from the requirement of extensive pretraining and a large training time. Towards mitigating such limitations, we propose "Attention Map (AM) Flow" for image models, a method for identifying pixels relevant to motion in each input video frame. In this context, we propose two methods to compute AM flow, depending on camera motion. AM flow allows the separation of spatial and temporal processing, while providing improved results over combined spatio-temporal processing (as in video models). Adapters, one of the popular techniques in parameter-efficient transfer learning, facilitate the incorporation of AM flow into pretrained image models, mitigating the need for full fine-tuning. We extend adapters to "temporal processing adapters" by incorporating a temporal processing unit into the adapters. Our work achieves faster convergence, therefore reducing the number of epochs needed for training. Moreover, we endow an image model with the ability to achieve state-of-the-art results on popular action recognition datasets. This reduces training time and simplifies pretraining. We present experiments on Kinetics-400, Something-Something v2, and Toyota Smarthome datasets, showcasing state-of-the-art or comparable results.
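
The abstract does not say how AM flow is computed, so the following is only an illustrative sketch, not the authors' method. It assumes that per-frame attention maps are already available as an array and that motion-relevant pixels can be approximated by the absolute difference between the attention maps of consecutive frames, which is plausible only for a static camera. The function name `am_flow_sketch` and the `(T, H, W)` input layout are hypothetical.

```python
import numpy as np


def am_flow_sketch(attn_maps: np.ndarray) -> np.ndarray:
    """Illustrative stand-in for AM flow (an assumption, not the paper's definition).

    attn_maps: array of shape (T, H, W), one attention map per video frame.
    Returns an array of shape (T - 1, H, W) holding the absolute change in
    attention between each pair of consecutive frames.
    """
    if attn_maps.ndim != 3 or attn_maps.shape[0] < 2:
        raise ValueError("expected shape (T, H, W) with T >= 2")
    # Large values mark pixels whose attention changed between frames,
    # which this sketch treats as a proxy for motion.
    return np.abs(attn_maps[1:] - attn_maps[:-1])


# Usage: three 2x2 frames where only one pixel "lights up" in the middle frame.
maps = np.zeros((3, 2, 2))
maps[1, 0, 0] = 1.0
flow = am_flow_sketch(maps)
print(flow.shape)
```

A moving camera would need some form of frame alignment before differencing; the abstract mentions a second method for that case but gives no details, so it is not sketched here.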

Benchmarks

Benchmark | Methodology | Metrics
action-classification-on-kinetics-400 | AM/12 ViT-B Dinov2 | Acc@1: 89.6
