The MECCANO Dataset: Understanding Human-Object Interactions from Egocentric Videos in an Industrial-like Domain
Francesco Ragusa, Antonino Furnari, Salvatore Livatino, Giovanni Maria Farinella

Abstract
Wearable cameras make it possible to collect images and videos of humans interacting with the world. While human-object interactions have been thoroughly investigated in third person vision, the problem has been understudied in egocentric settings and in industrial scenarios. To fill this gap, we introduce MECCANO, the first dataset of egocentric videos to study human-object interactions in industrial-like settings. MECCANO has been acquired by 20 participants who were asked to build a motorbike model, for which they had to interact with tiny objects and tools. The dataset has been explicitly labeled for the task of recognizing human-object interactions from an egocentric perspective. Specifically, each interaction has been labeled both temporally (with action segments) and spatially (with active object bounding boxes). With the proposed dataset, we investigate four different tasks: 1) action recognition, 2) active object detection, 3) active object recognition and 4) egocentric human-object interaction detection, which is a revisited version of the standard human-object interaction detection task. Baseline results show that the MECCANO dataset is a challenging benchmark to study egocentric human-object interactions in industrial-like scenarios. We publicly release the dataset at https://iplab.dmi.unict.it/MECCANO.
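The annotation scheme pairs temporal labels (action segments) with spatial labels (active object bounding boxes). Below is a minimal Python sketch of one way such annotations could be structured in code; the class and field names (`InteractionSegment`, `ActiveObject`, `verb`, etc.) are illustrative assumptions, not the dataset's actual annotation format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ActiveObject:
    """Spatial annotation: an object the participant is interacting with.

    All field names are hypothetical, chosen only to illustrate the
    temporal + spatial labeling described in the abstract.
    """
    label: str                   # object class, e.g. "screwdriver"
    frame: int                   # frame index the bounding box refers to
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixel coordinates

@dataclass
class InteractionSegment:
    """Temporal annotation: one human-object interaction."""
    verb: str                    # action verb, e.g. "take"
    start_frame: int             # first frame of the action segment
    end_frame: int               # last frame of the action segment
    active_objects: List[ActiveObject] = field(default_factory=list)
```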
Benchmarks
| Benchmark | Methodology | Metric | Score |
|---|---|---|---|
| Action Recognition on MECCANO | SlowFast | Top-1 Accuracy | 42.85 |
| Human-Object Interaction Detection on MECCANO | SlowFast + Faster R-CNN | mAP@0.5 (role) | 25.93 |
| Object Recognition on MECCANO | Faster R-CNN | mAP | 30.39 |
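For reference, the Top-1 Accuracy reported above is conventionally computed as the fraction of clips whose highest-scoring predicted class matches the ground-truth action label. A minimal PyTorch sketch of that convention (the official MECCANO evaluation code may differ):

```python
import torch

def top1_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Fraction of clips whose top-scoring class matches the ground truth.

    logits: (num_clips, num_classes) raw class scores from the model.
    labels: (num_clips,) ground-truth action class indices.
    """
    preds = logits.argmax(dim=1)            # predicted class per clip
    return (preds == labels).float().mean().item()
```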