Point-Level Temporal Action Localization: Bridging Fully-supervised Proposals to Weakly-supervised Losses
Chen Ju Peisen Zhao Ya Zhang Yanfeng Wang Qi Tian

Abstract
Point-level temporal action localization (PTAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance. Existing methods adopt the frame-level prediction paradigm to learn from the sparse single-frame labels. However, such a framework inevitably suffers from a large solution space. This paper explores the proposal-based prediction paradigm for point-level annotations, which has the advantage of a more constrained solution space and consistent predictions among neighboring frames. The point-level annotations are first used as keypoint supervision to train a keypoint detector. At the location prediction stage, a simple but effective mapper module, which enables back-propagation of training errors, is then introduced to bridge the fully-supervised framework with weak supervision. To the best of our knowledge, this is the first work to leverage the fully-supervised paradigm for the point-level setting. Experiments on THUMOS14, BEOID, and GTEA verify the effectiveness of the proposed method both quantitatively and qualitatively, and demonstrate that it outperforms state-of-the-art methods.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| weakly-supervised-action-localization-on | Ju et al. | mAP@0.1:0.5: 55.6; mAP@0.1:0.7: 44.8; mAP@0.5: 35.9 |
| weakly-supervised-action-localization-on-4 | Ju et al. | mAP@0.5: 35.9 |
| weakly-supervised-action-localization-on-5 | Ju et al. | avg-mAP (0.1:0.5): 55.6; avg-mAP (0.1:0.7): 44.8; avg-mAP (0.3:0.7): 35.4 |
| weakly-supervised-action-localization-on-6 | Ju et al. | mAP@0.1:0.7: 34.9; mAP@0.5: 20.9 |
| weakly-supervised-action-localization-on-gtea | Ju et al. | mAP@0.1:0.7: 33.7; mAP@0.5: 21.9 |
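
The metrics in the table are mean average precision (mAP) values at temporal IoU (tIoU) thresholds. A range such as 0.1:0.7 is conventionally read as the mean of the per-threshold mAP values from 0.1 to 0.7 in steps of 0.1. The sketch below illustrates that reading. It is not the authors' evaluation code: the function names, the step size of 0.1, and the per-threshold numbers in the usage example are assumptions made for illustration only.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments, e.g. in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def average_map(map_by_threshold, start, stop, step=0.1):
    """Mean of per-threshold mAP values over [start, stop] (inclusive).

    `map_by_threshold` is a hypothetical dict keyed by tIoU threshold
    rounded to two decimals; the 0.1 step is an assumed convention.
    """
    n = round((stop - start) / step) + 1
    thresholds = [round(start + i * step, 2) for i in range(n)]
    return sum(map_by_threshold[t] for t in thresholds) / len(thresholds)


if __name__ == "__main__":
    # A prediction overlapping half of a ground-truth segment.
    print(temporal_iou((0.0, 10.0), (5.0, 15.0)))

    # Made-up per-threshold mAP values, NOT results from the paper.
    toy = {0.1: 0.6, 0.2: 0.5, 0.3: 0.4, 0.4: 0.3, 0.5: 0.2}
    print(average_map(toy, 0.1, 0.5))
```

A predicted segment counts as a correct detection at a given threshold when its tIoU with a ground-truth instance of the same class meets or exceeds that threshold, so higher thresholds demand tighter boundaries.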