Interaction Region Visual Transformer for Egocentric Action Anticipation

Debaditya Roy, Ramanathan Rajendiran, Basura Fernando

Abstract

Human-object interaction is one of the most important visual cues, and we propose a novel way to represent human-object interactions for egocentric action anticipation. We introduce a transformer variant that models interactions by computing the change in the appearance of objects and human hands caused by the execution of actions, and uses those changes to refine the video representation. Specifically, we model interactions between hands and objects using Spatial Cross-Attention (SCA) and further infuse contextual information using Trajectory Cross-Attention to obtain environment-refined interaction tokens. Using these tokens, we construct an interaction-centric video representation for action anticipation. We term our model InAViT; it achieves state-of-the-art action anticipation performance on the large-scale egocentric datasets EPIC-KITCHENS-100 (EK100) and EGTEA Gaze+. InAViT outperforms other visual transformer-based methods, including those built on object-centric video representations. On the EK100 evaluation server, InAViT is the top-performing method on the public leaderboard (at the time of submission), where it outperforms the second-best model by 3.3% on mean top-5 recall.
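The cross-attention step named in the abstract can be illustrated with a minimal sketch. This is generic single-head scaled dot-product cross-attention in plain Python, not the authors' implementation: it assumes hand tokens act as queries and object tokens as keys and values, omits learned projections, and uses a residual update. The names `cross_attention`, `refine`, `hand_tokens`, and `object_tokens` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (no learned projections)."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # Similarity of this query token to every key token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value tokens, one output dimension at a time.
        dim = len(values[0])
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return outputs


def refine(tokens, context):
    """Assumed residual update: token + attention(token -> context)."""
    attended = cross_attention(tokens, context, context)
    return [[t + a for t, a in zip(tok, att)] for tok, att in zip(tokens, attended)]


# Hypothetical toy tokens: one hand token and two object tokens of dimension 4.
hand_tokens = [[1.0, 0.0, 0.0, 0.0]]
object_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

interaction_tokens = refine(hand_tokens, object_tokens)
```

Under the same assumptions, passing a different set of context tokens to `refine` would give a comparable sketch of infusing contextual information into the interaction tokens.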

Code Repositories

lahaproject/inavit (PyTorch; mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| action-anticipation-on-egtea | InAViT | Top-1 Accuracy: 67.8 |
| action-anticipation-on-epic-kitchens-100 | InAViT | Recall@5: 25.89 |
| action-anticipation-on-epic-kitchens-100-test | InAViT | Recall@5: 23.75 |
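For readers unfamiliar with the metric, the sketch below shows one way class-mean top-k recall can be computed. It assumes the usual definition: a sample counts as a hit when its true class is among the k highest-scoring classes, recall is computed per class, and the per-class recalls are then averaged. The function name `mean_topk_recall` and the toy scores are hypothetical, and this is not the official evaluation code.

```python
from collections import defaultdict


def mean_topk_recall(scores, labels, k=5):
    """Average over classes of the fraction of samples whose label is in the top k."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for row, label in zip(scores, labels):
        # Indices of the k highest-scoring classes for this sample.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        counts[label] += 1
        if label in topk:
            hits[label] += 1
    # Only classes that appear in the labels contribute to the mean.
    recalls = [hits[c] / counts[c] for c in counts]
    return sum(recalls) / len(recalls)


# Toy example with three classes and three samples (hypothetical values).
toy_scores = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.6, 0.3, 0.1],
]
toy_labels = [0, 1, 1]

toy_recall_at_1 = mean_topk_recall(toy_scores, toy_labels, k=1)
```

Averaging per class rather than per sample keeps frequent classes from dominating the score, which matters on long-tailed action distributions.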
