CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning

Can Zhang Meng Cao Dongming Yang Jie Chen Yuexian Zou

Abstract

Weakly-supervised temporal action localization (WS-TAL) aims to localize actions in untrimmed videos with only video-level labels. Most existing models follow the "localization by classification" procedure: they locate the temporal regions that contribute most to the video-level classification. Generally, they process each snippet (or frame) individually and thus overlook the rich temporal context. This gives rise to the single-snippet cheating issue: "hard" snippets are too ambiguous to be classified on their own. In this paper, we argue that learning by comparing helps identify these hard snippets, and we propose to utilize snippet Contrastive learning to Localize Actions, CoLA for short. Specifically, we propose a Snippet Contrast (SniCo) Loss to refine the hard snippet representation in feature space, which guides the network to perceive precise temporal boundaries and avoid interrupting temporal intervals. Moreover, since frame-level annotations are not accessible, we introduce a Hard Snippet Mining algorithm to locate the potential hard snippets. Extensive analyses verify that this mining strategy effectively captures the hard snippets and that the SniCo Loss leads to more informative feature representations. Extensive experiments show that CoLA achieves state-of-the-art results on the THUMOS'14 and ActivityNet v1.2 datasets. The CoLA code is publicly available at https://github.com/zhang-can/CoLA.
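
The abstract does not give the formula for the SniCo Loss, so the following is only a minimal sketch of the general idea of contrasting a hard snippet against other snippets. It assumes an InfoNCE-style objective with cosine similarity; the function name, the temperature value of 0.07, and the way positives and negatives are supplied are all illustrative assumptions, not the official implementation (see the repository for that).

```python
import math


def l2_normalize(v):
    """Scale a feature vector to unit length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def snippet_contrast_loss(query, positive, negatives, temperature=0.07):
    """InfoNCE-style loss for one hard snippet (hypothetical helper).

    query:     feature of a hard snippet
    positive:  feature of a snippet assumed to share its action/background label
    negatives: features of snippets assumed to have the opposite label
    The temperature is an assumed hyperparameter.
    """
    q = l2_normalize(query)
    pos_logit = dot(q, l2_normalize(positive)) / temperature
    neg_logits = [dot(q, l2_normalize(n)) / temperature for n in negatives]

    # Numerically stable log-sum-exp over the positive and all negatives.
    logits = [pos_logit] + neg_logits
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))

    # Equals -log(exp(pos) / (exp(pos) + sum(exp(neg)))).
    return log_denominator - pos_logit


# A hard snippet that already resembles its positive gives a small loss;
# one that resembles a negative gives a large loss.
aligned = snippet_contrast_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = snippet_contrast_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

Minimizing such a loss pulls the hard snippet's feature toward the positive and away from the negatives, which is the "learning by comparing" intuition described in the abstract.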

Code Repositories

zhang-can/CoLA (official, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
weakly-supervised-action-localization-on | CoLA | mAP@0.1:0.5: 50.3; mAP@0.1:0.7: 40.9; mAP@0.5: 32.2
weakly-supervised-action-localization-on-2 | CoLA | Mean mAP: 26.1; mAP@0.5: 42.7
weakly-supervised-action-localization-on-4 | CoLA | mAP@0.5: 32.2
weakly-supervised-action-localization-on-5 | CoLA | avg-mAP (0.1-0.5): 50.3; avg-mAP (0.1:0.7): 40.9; avg-mAP (0.3-0.7): 32.1
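
The thresholds in metrics such as mAP@0.5 refer to the temporal intersection-over-union (tIoU) between a predicted segment and a ground-truth segment. As a rough illustration of that quantity only (not of the full mAP evaluation protocol), it can be sketched as follows; the function names and the (start, end) tuple layout are assumptions made for this example.

```python
def temporal_iou(pred, gt):
    """tIoU of two segments given as (start, end) times in seconds."""
    intersection = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(pred, gt, threshold=0.5):
    """A prediction can count as correct at a threshold if tIoU reaches it."""
    return temporal_iou(pred, gt) >= threshold


# Segments overlapping for 5 s out of a 15 s union.
example_iou = temporal_iou((0.0, 10.0), (5.0, 15.0))
```

Averaged metrics such as avg-mAP (0.1:0.7) repeat the evaluation at several tIoU thresholds in the stated range and average the resulting mAP values.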
