Rolling-Unrolling LSTMs for Action Anticipation from First-Person Video

Antonino Furnari, Giovanni Maria Farinella

Abstract

In this paper, we tackle the problem of egocentric action anticipation, i.e., predicting what actions the camera wearer will perform in the near future and which objects they will interact with. Specifically, we contribute Rolling-Unrolling LSTM, a learning architecture to anticipate actions from egocentric videos. The method is based on three components: 1) an architecture composed of two LSTMs to model the sub-tasks of summarizing the past and inferring the future, 2) a Sequence Completion Pre-Training technique which encourages the LSTMs to focus on the different sub-tasks, and 3) a Modality ATTention (MATT) mechanism to efficiently fuse multi-modal predictions obtained by processing RGB frames, optical flow fields and object-based features. The proposed approach is validated on EPIC-Kitchens, EGTEA Gaze+ and ActivityNet. The experiments show that the proposed architecture is state-of-the-art in the domain of egocentric videos, achieving top performance in the 2019 EPIC-Kitchens egocentric action anticipation challenge. The approach also achieves competitive performance on ActivityNet with respect to methods not based on unsupervised pre-training and generalizes to the tasks of early action recognition and action recognition. To encourage research on this challenging topic, we have made our code, trained models, and pre-extracted features available at our web page: http://iplab.dmi.unict.it/rulstm.
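
To make the idea of fusing multi-modal predictions concrete, here is a minimal sketch of an attention-weighted late fusion over the three modalities named in the abstract. It assumes the fused score is a softmax-weighted sum of per-modality class scores. The abstract does not say how the attention logits are produced, so they are passed in as plain numbers here. The function names, the modality keys and the toy numbers are illustrative assumptions and are not taken from the authors' code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(scores_by_modality, attention_logits):
    """Fuse per-modality class scores with softmax attention weights.

    scores_by_modality: dict mapping modality name -> list of class scores.
    attention_logits:   dict mapping modality name -> unnormalized weight
                        (assumed to be given; a learned network would
                        normally produce these).
    Returns (fused_scores, weights_by_modality).
    """
    names = list(scores_by_modality)
    weights = softmax([attention_logits[n] for n in names])
    num_classes = len(scores_by_modality[names[0]])
    fused = [0.0] * num_classes
    for name, w in zip(names, weights):
        for c, s in enumerate(scores_by_modality[name]):
            fused[c] += w * s
    return fused, dict(zip(names, weights))


# Toy example: three modalities, three action classes (made-up numbers).
scores = {
    "rgb": [2.0, 0.5, 0.1],
    "flow": [0.2, 1.5, 0.3],
    "obj": [1.0, 1.0, 3.0],
}
equal_logits = {"rgb": 0.0, "flow": 0.0, "obj": 0.0}
fused, weights = fuse_modalities(scores, equal_logits)
```

With equal logits the fusion reduces to a plain average of the modality scores. Raising the logit of one modality shifts the fused scores toward that modality's prediction.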

Code Repositories

antoninofurnari/rulstm (PyTorch, mentioned in GitHub)
fpv-iplab/rulstm (PyTorch, mentioned in GitHub)

Benchmarks

Benchmark: action-anticipation-on-epic-kitchens-100-test
Methodology: RULSTM
Metrics: recall@5: 11.2
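
For readers unfamiliar with the metric, the sketch below shows one way a recall@5 figure can be computed. It assumes recall@5 means class-mean top-5 recall: for each ground-truth class, the fraction of its samples whose true label appears among the five highest-scoring predictions, averaged over classes. This reading is an assumption and the listing above does not confirm it. The helper names and the toy data are hypothetical.

```python
from collections import defaultdict


def top_k_hit(scores, label, k=5):
    """Return True if `label` is among the k highest-scoring class indices."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return label in ranked[:k]


def mean_top_k_recall(all_scores, labels, k=5):
    """Class-mean top-k recall over the classes that occur in `labels`."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for scores, label in zip(all_scores, labels):
        counts[label] += 1
        hits[label] += int(top_k_hit(scores, label, k))
    per_class = [hits[c] / counts[c] for c in counts]
    return sum(per_class) / len(per_class)


# Toy data: three samples, three classes (made-up scores).
toy_scores = [
    [0.9, 0.1, 0.0],  # true class 0, ranked first
    [0.2, 0.7, 0.1],  # true class 1, ranked first
    [0.6, 0.3, 0.1],  # true class 1, ranked second
]
toy_labels = [0, 1, 1]

recall_at_1 = mean_top_k_recall(toy_scores, toy_labels, k=1)
recall_at_2 = mean_top_k_recall(toy_scores, toy_labels, k=2)
```

Averaging per class rather than per sample keeps frequent classes from dominating the score, which matters on long-tailed action vocabularies.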
