Weakly Supervised Action Selection Learning in Video

Junwei Ma Satya Krishna Gorti Maksims Volkovs Guangwei Yu

Abstract

Localizing actions in video is a core task in computer vision. The weakly supervised temporal localization problem investigates whether this task can be adequately solved with only video-level labels, significantly reducing the amount of expensive and error-prone annotation that is required. A common approach is to train a frame-level classifier in which the frames with the highest class probability are selected to make a video-level prediction. Frame-level activations are then used for localization. However, the absence of frame-level annotations causes the classifier to impart class bias on every frame. To address this, we propose the Action Selection Learning (ASL) approach to capture the general concept of action, a property we refer to as "actionness". Under ASL, the model is trained with a novel class-agnostic task to predict which frames will be selected by the classifier. Empirically, we show that ASL outperforms leading baselines on two popular benchmarks, THUMOS-14 and ActivityNet-1.2, with 10.3% and 5.7% relative improvement, respectively. We further analyze the properties of ASL and demonstrate the importance of actionness. Full code for this work is available here: https://github.com/layer6ai-labs/ASL.
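
The selection step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation (see the official repository for that): it assumes that the video-level score is the mean of the top-k frame scores for a class, and that the class-agnostic target marks a frame as 1 when it was selected for any ground-truth class. The function names, the choice of k, and the toy scores are all hypothetical.

```python
from typing import List, Sequence


def top_k_frames(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames, in temporal order."""
    k = max(1, min(k, len(scores)))  # clamp k to a valid range
    # Sort by descending score; ties go to the earlier frame.
    order = sorted(range(len(scores)), key=lambda t: (-scores[t], t))
    return sorted(order[:k])


def video_level_score(frame_scores: Sequence[Sequence[float]], cls: int, k: int) -> float:
    """Assumed pooling: average the class score over its top-k frames."""
    per_frame = [row[cls] for row in frame_scores]
    selected = top_k_frames(per_frame, k)
    return sum(per_frame[t] for t in selected) / len(selected)


def selection_targets(
    frame_scores: Sequence[Sequence[float]], video_labels: Sequence[int], k: int
) -> List[int]:
    """Class-agnostic targets: 1 if the frame was selected for any labeled class."""
    selected = set()
    for cls in video_labels:
        per_frame = [row[cls] for row in frame_scores]
        selected.update(top_k_frames(per_frame, k))
    return [1 if t in selected else 0 for t in range(len(frame_scores))]


# Toy example: 6 frames, 2 classes, video labeled with class 0 only.
frame_scores = [
    [0.1, 0.2],
    [0.9, 0.1],
    [0.8, 0.3],
    [0.2, 0.7],
    [0.1, 0.1],
    [0.7, 0.2],
]
targets = selection_targets(frame_scores, video_labels=[0], k=3)
```

In this toy case, frames 1, 2, and 5 are selected for class 0, so they become the positive targets that a separate class-agnostic branch would be trained to predict.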

Code Repositories

layer6ai-labs/ASL (official, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
weakly-supervised-action-localization-on-2 | ASL | Mean mAP: 25.8; mAP@0.5: 40.2
weakly-supervised-action-localization-on-7 | ASL | mAP: 3.30; mAP IOU@0.5: 2.68; mAP IOU@0.75: 0.81; mAP IOU@0.95: 3.30
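
The metrics above are reported at temporal IoU thresholds. The page does not spell out the evaluation protocol, so the following is only a sketch of the commonly used temporal IoU between a predicted and a ground-truth segment, each given as a (start, end) pair in seconds. The function name and the example segments are hypothetical.

```python
from typing import Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection over union of two 1-D time intervals."""
    (ps, pe), (gs, ge) = pred, gt
    inter = max(0.0, min(pe, ge) - max(ps, gs))
    union = (pe - ps) + (ge - gs) - inter
    return inter / union if union > 0 else 0.0


# A prediction counts as a match at threshold 0.5 only if its IoU is at least 0.5.
loose = temporal_iou((2.0, 6.0), (4.0, 8.0))   # overlap 2 s, union 6 s
tight = temporal_iou((4.0, 7.0), (4.0, 8.0))   # overlap 3 s, union 4 s
```

Here the first prediction would be rejected at a 0.5 threshold, while the second would be accepted at both 0.5 and 0.75.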
