5 months ago

Human-Object Interaction Prediction in Videos through Gaze Following

Ni Zhifan ; Mascaró Esteve Valls ; Ahn Hyemin ; Lee Dongheui

Abstract

Understanding human-object interactions (HOIs) in a video is essential to fully comprehend a visual scene. This line of research has been addressed by detecting HOIs from images and, more recently, from videos. However, the video-based HOI anticipation task in the third-person view remains understudied. In this paper, we design a framework to detect current HOIs and anticipate future HOIs in videos. We propose to leverage human gaze information, since people often fixate on an object before interacting with it. These gaze features, together with the scene contexts and the visual appearances of human-object pairs, are fused through a spatio-temporal transformer. To evaluate the model in the HOI anticipation task in a multi-person scenario, we propose a set of person-wise multi-label metrics. Our model is trained and validated on the VidHOI dataset, which contains videos capturing daily life and is currently the largest video HOI dataset. Experimental results in the HOI detection task show that our approach improves on the baseline by a relative margin of 36.3%. Moreover, we conduct an extensive ablation study to demonstrate the effectiveness of our modifications and extensions to the spatio-temporal transformer. Our code is publicly available at https://github.com/nizhf/hoi-prediction-gaze-transformer.
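
The abstract mentions person-wise multi-label metrics but does not define them. The following is only a minimal sketch of what such a metric could look like, assuming each person has a score per interaction label and a set of ground-truth labels; the function name `person_wise_top_k`, the label strings, and the skipping of persons without ground truth are all illustrative assumptions, not the paper's definition.

```python
from typing import Dict, List, Set, Tuple


def person_wise_top_k(
    scores: Dict[str, Dict[str, float]],
    truth: Dict[str, Set[str]],
    k: int = 5,
) -> Tuple[float, float]:
    """Return (mean precision, mean recall) over persons.

    Hypothetical metric: for every person, keep the k highest-scoring
    interaction labels and compare them with that person's ground truth.
    """
    precisions: List[float] = []
    recalls: List[float] = []
    for person, label_scores in scores.items():
        gt = truth.get(person, set())
        if not gt or not label_scores:
            continue  # assumption: persons without labels are skipped
        # Rank labels by descending score and keep the top k.
        ranked = sorted(label_scores, key=label_scores.get, reverse=True)[:k]
        hits = len(gt.intersection(ranked))
        precisions.append(hits / len(ranked))
        recalls.append(hits / len(gt))
    if not precisions:
        return 0.0, 0.0
    return sum(precisions) / len(precisions), sum(recalls) / len(recalls)


# Made-up scores for two persons in one frame.
example_scores = {
    "p1": {"hold-cup": 0.9, "watch-tv": 0.2, "sit-chair": 0.7},
    "p2": {"hold-cup": 0.1, "watch-tv": 0.8, "sit-chair": 0.3},
}
example_truth = {"p1": {"hold-cup", "sit-chair"}, "p2": {"sit-chair"}}
print(person_wise_top_k(example_scores, example_truth, k=2))
```

Averaging per person rather than per frame keeps a scene with many people from being dominated by whichever person has the most labels, which is one plausible motivation for evaluating a multi-person scenario this way.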

Code Repositories

nizhf/hoi-prediction-gaze-transformer (official, PyTorch)

Benchmarks

Benchmark: human-object-interaction-anticipation-on
Methodology: ST-GAZE
Metrics:
Person-wise Top5: t=1 (mAP@0.5): 37.59
Person-wise Top5: t=3 (mAP@0.5): 33.14
Person-wise Top5: t=5 (mAP@0.5): 32.75

Benchmark: human-object-interaction-detection-on-vidhoi
Methodology: ST-GAZE
Metrics:
Detection: Full (mAP@0.5): 10.4
Detection: Non-Rare (mAP@0.5): 16.83
Detection: Rare (mAP@0.5): 5.46
Oracle: Full (mAP@0.5): 38.61
Oracle: Non-Rare (mAP@0.5): 52.44
Oracle: Rare (mAP@0.5): 27.99
