End-to-End Semi-Supervised Learning for Video Action Detection

Kumar Akash ; Rawat Yogesh Singh

Abstract

In this work, we focus on semi-supervised learning for video action detection, which utilizes both labeled and unlabeled data. We propose a simple end-to-end consistency-based approach that effectively utilizes the unlabeled data. Video action detection requires both action class prediction and spatio-temporal localization of actions. Therefore, we investigate two types of constraints: classification consistency and spatio-temporal consistency. The presence of predominant background and static regions in a video makes it challenging to utilize spatio-temporal consistency for action detection. To address this, we propose two novel regularization constraints for spatio-temporal consistency: 1) temporal coherency, and 2) gradient smoothness. Both of these aspects exploit the temporal continuity of action in videos and are found to be effective for utilizing unlabeled videos for action detection. We demonstrate the effectiveness of the proposed approach on two action detection benchmark datasets, UCF101-24 and JHMDB-21. In addition, we show the effectiveness of the proposed approach for video object segmentation on YouTube-VOS, which demonstrates its generalization capability. The proposed approach achieves competitive performance using merely 20% of the annotations on UCF101-24 when compared with recent fully supervised methods. On UCF101-24, it improves the score by +8.9% and +11% at 0.5 f-mAP and v-mAP, respectively, compared to the supervised approach.
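
The two consistency terms named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the function names, the choice of Jensen-Shannon divergence for classification consistency, the mean squared error for spatio-temporal consistency, and the frame-difference weighting used as a rough stand-in for temporal coherency are all hypothetical. Localization maps are represented as a list of frames, each a flat list of per-pixel probabilities, and the two inputs stand for predictions on two augmented views of the same unlabeled clip.

```python
import math

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete class distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log((a + eps) / (b + eps)) for a, b in zip(x, y))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def temporal_weights(loc_map):
    """Per-pixel weights that grow where a pixel changes between neighbouring frames.

    Hypothetical stand-in for a temporal-coherency term: static and background
    pixels receive weight 0, pixels that vary over time receive larger weights.
    """
    num_frames = len(loc_map)
    weights = []
    for t in range(num_frames):
        prev_frame = loc_map[max(t - 1, 0)]
        next_frame = loc_map[min(t + 1, num_frames - 1)]
        weights.append([abs(n - p) for p, n in zip(prev_frame, next_frame)])
    return weights

def weighted_mse(map_a, map_b, weights, base=1.0):
    """Mean squared difference between two maps, scaled per pixel by (base + weight)."""
    total, count = 0.0, 0
    for frame_a, frame_b, frame_w in zip(map_a, map_b, weights):
        for a, b, w in zip(frame_a, frame_b, frame_w):
            total += (base + w) * (a - b) ** 2
            count += 1
    return total / count

def consistency_loss(cls_a, cls_b, loc_a, loc_b):
    """Sum of a classification term and a spatio-temporal term (assumed equal weighting)."""
    weights = temporal_weights(loc_a)
    return js_divergence(cls_a, cls_b) + weighted_mse(loc_a, loc_b, weights)
```

In this sketch the loss is zero when both views agree exactly and grows as either the class distributions or the localization maps diverge; no labels are needed, which is what makes such a term usable on unlabeled videos.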

Benchmarks

Benchmark: action-detection-on-ucf101-24
Methodology: E2E-SSL (I3D)
Metrics: Frame-mAP 0.5: 69.9; Video-mAP 0.5: 72.1
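
The "0.5" in both metrics is an intersection-over-union (IoU) threshold between predicted and ground-truth localizations: per-frame boxes for Frame-mAP, and spatio-temporal tubes for Video-mAP. As a minimal sketch of the frame-level check, assuming axis-aligned boxes in `(x1, y1, x2, y2)` form (the function names are hypothetical, and this is not the benchmark's evaluation code):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred_box, pred_label, gt_box, gt_label, thresh=0.5):
    """A detection counts as correct if the class matches and IoU reaches the threshold."""
    return pred_label == gt_label and box_iou(pred_box, gt_box) >= thresh
```

Average precision is then computed per class from the ranked true and false positives and averaged over classes; whether the comparison is strict or inclusive at exactly 0.5 varies between evaluation scripts, so the `>=` here is an assumption.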
