Integrating Human Gaze into Attention for Egocentric Activity Recognition

Kyle Min; Jason J. Corso

Abstract

It is well known that human gaze carries significant information about visual attention. However, there are three main difficulties in incorporating the gaze data in an attention mechanism of deep neural networks: 1) the gaze fixation points are likely to have measurement errors due to blinking and rapid eye movements; 2) it is unclear when and how much the gaze data is correlated with visual attention; and 3) gaze data is not always available in many real-world situations. In this work, we introduce an effective probabilistic approach to integrate human gaze into spatiotemporal attention for egocentric activity recognition. Specifically, we represent the locations of gaze fixation points as structured discrete latent variables to model their uncertainties. In addition, we model the distribution of gaze fixations using a variational method. The gaze distribution is learned during the training process so that the ground-truth annotations of gaze locations are no longer needed in testing situations since they are predicted from the learned gaze distribution. The predicted gaze locations are used to provide informative attentional cues to improve the recognition performance. Our method outperforms all the previous state-of-the-art approaches on EGTEA, which is a large-scale dataset for egocentric activity recognition provided with gaze measurements. We also perform an ablation study and qualitative analysis to demonstrate that our attention mechanism is effective.
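
The idea of treating the gaze location as a discrete latent variable and using it as an attentional cue can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how samples are drawn, so the Gumbel-softmax relaxation used here is an assumed, commonly used choice, and the grid size, logits, feature values, and all function names are hypothetical. The sketch uses only the Python standard library.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn per-location scores into a categorical distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax_sample(logits, temperature, rng):
    """Draw a relaxed one-hot sample over grid cells (assumed technique)."""
    noisy = []
    for x in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append(x - math.log(-math.log(u)))
    return softmax(noisy, temperature)


def attend(features, weights):
    """Pool per-location feature vectors using the weights as attention."""
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


# Hypothetical 2x3 spatial grid, flattened to 6 cells.
gaze_logits = [0.1, 2.0, 0.3, -1.0, 0.5, 0.0]
# Hypothetical 2-dimensional feature vector per cell.
features = [[float(i), float(i) * 2.0] for i in range(6)]

rng = random.Random(0)
gaze_distribution = softmax(gaze_logits)  # predicted gaze distribution
gaze_sample = gumbel_softmax_sample(gaze_logits, temperature=0.5, rng=rng)
pooled = attend(features, gaze_sample)  # gaze-weighted feature
```

At test time no ground-truth gaze is needed in this sketch either: the weights come entirely from the predicted logits. A lower temperature pushes the sample closer to a single grid cell, while a higher one spreads attention over more cells.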

Code Repositories

kylemin/Gaze-Attention (official, PyTorch)

Benchmarks

Benchmark: egocentric-activity-recognition-on-egtea-1
Methodology: Min et al.
Metrics:
Average Accuracy: 69.58
Mean class accuracy: 62.84
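
The two metrics can differ noticeably when classes are imbalanced. The sketch below assumes the usual definitions, which this page does not spell out: average accuracy as the fraction of correctly classified samples, and mean class accuracy as the unweighted mean of per-class accuracies. The labels and predictions are made up for illustration and are not taken from EGTEA.

```python
from collections import defaultdict


def average_accuracy(y_true, y_pred):
    """Fraction of samples whose prediction matches the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_class_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class accuracies."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical labels: one frequent class and two rare ones.
y_true = ["cut", "cut", "cut", "cut", "pour", "wash"]
y_pred = ["cut", "cut", "cut", "cut", "cut", "wash"]

overall = average_accuracy(y_true, y_pred)       # 5 of 6 samples correct
per_class = mean_class_accuracy(y_true, y_pred)  # mean of 1.0, 0.0, 1.0
```

In this toy case the overall accuracy is 5/6 while the mean class accuracy is 2/3, because the single missed "pour" sample counts as a whole class.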
