3 months ago

Enhancing Temporal Action Localization: Advanced S6 Modeling with Recurrent Mechanism

Sangyoun Lee, Juho Jung, Changdae Oh, Sunghee Yun

Abstract

Temporal Action Localization (TAL) is a critical video analysis task that identifies the precise start and end times of actions. Existing methods based on CNNs, RNNs, GCNs, and Transformers are limited in their ability to capture long-range dependencies and temporal causality. To address these challenges, we propose a novel TAL architecture that leverages the Selective State Space Model (S6). Our approach integrates the Feature Aggregated Bi-S6 block, the Dual Bi-S6 structure, and a recurrent mechanism to enhance temporal and channel-wise dependency modeling without increasing parameter complexity. Extensive experiments on benchmark datasets demonstrate state-of-the-art results, with average mAP scores of 74.2% on THUMOS-14, 42.9% on ActivityNet, 29.6% on FineAction, and 45.8% on HACS. Ablation studies validate our method's effectiveness, showing that the Dual structure in the Stem module and the recurrent mechanism outperform traditional approaches. Our findings demonstrate the potential of S6-based models in TAL tasks, paving the way for future research.
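
For readers unfamiliar with S6, the following is a minimal sketch of a generic selective state space recurrence run in both temporal directions. It is not the paper's Feature Aggregated Bi-S6 block or Dual Bi-S6 structure. The single-channel setup, the parameter names (`a`, `w_delta`, `w_b`, `w_c`), the scalar projections, and the choice to sum the two directions are all assumptions made for illustration.

```python
import math

def softplus(z):
    # Keeps the step size positive.
    return math.log1p(math.exp(z))

def selective_scan(xs, a, w_delta, w_b, w_c):
    """Single-channel selective SSM recurrence (illustrative only).

    xs: list of floats (one feature value per time step)
    a: list of negative floats, the diagonal of the state matrix
    w_delta, w_b, w_c: hypothetical projection weights that make the
    step size, input matrix, and output matrix depend on the input.
    """
    h = [0.0] * len(a)
    ys = []
    for x in xs:
        delta = softplus(w_delta * x)   # input-dependent step size
        b = [w * x for w in w_b]        # input-dependent B_t
        c = [w * x for w in w_c]        # input-dependent C_t
        # h_t = exp(delta * A) * h_{t-1} + delta * B_t * x_t
        h = [math.exp(delta * a_n) * h_n + delta * b_n * x
             for a_n, h_n, b_n in zip(a, h, b)]
        # y_t = C_t . h_t
        ys.append(sum(c_n * h_n for c_n, h_n in zip(c, h)))
    return ys

def bidirectional_scan(xs, **params):
    """Sum of a forward scan and a time-reversed scan (an assumed fusion)."""
    fwd = selective_scan(xs, **params)
    bwd = selective_scan(xs[::-1], **params)[::-1]
    return [f + g for f, g in zip(fwd, bwd)]

params = dict(a=[-1.0, -0.5], w_delta=0.3, w_b=[1.0, 0.5], w_c=[0.7, -0.2])
features = [0.5, -1.0, 2.0, 0.3]
outputs = bidirectional_scan(features, **params)
```

The forward scan alone is causal: the output at step t depends only on inputs up to t. Adding the reversed scan lets every step also see later inputs, which is the usual motivation for a bidirectional design.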

Code Repositories

lsy0882/RDFA-S6
Official
pytorch

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| temporal-action-localization-on-activitynet | RDFA-S6 (InternVideo2-6B) | mAP: 42.9; mAP IOU@0.5: 64.1; mAP IOU@0.75: 44.0; mAP IOU@0.95: 10.6 |
| temporal-action-localization-on-fineaction | RDFA-S6 (InternVideo2-6B) | mAP: 29.6; mAP IOU@0.5: 46.4; mAP IOU@0.75: 29.5; mAP IOU@0.95: 7.6 |
| temporal-action-localization-on-hacs | RDFA-S6 (InternVideo2-6B) | Average-mAP: 45.8; mAP@0.5: 66.4; mAP@0.75: 47.2; mAP@0.95: 14.3 |
| temporal-action-localization-on-thumos14 | RDFA-S6 (InternVideo2-6B) | Avg mAP (0.3:0.7): 74.2; mAP IOU@0.3: 88.7; mAP IOU@0.4: 84.6; mAP IOU@0.5: 78.2; mAP IOU@0.6: 66.6; mAP IOU@0.7: 51.9 |
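
The IoU thresholds in these metrics refer to temporal IoU, the overlap between a predicted action segment and a ground-truth segment divided by their combined extent. A minimal sketch of that check follows. The function names `temporal_iou` and `is_hit` and the example segments are hypothetical, and the sketch covers only the per-segment matching test, not the full mAP computation used by the benchmarks.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two (start, end) segments, e.g. in seconds."""
    (ps, pe), (gs, ge) = pred, gt
    # Overlap is zero when the segments do not intersect.
    inter = max(0.0, min(pe, ge) - max(ps, gs))
    union = (pe - ps) + (ge - gs) - inter
    return inter / union if union > 0 else 0.0

def is_hit(pred, gt, threshold):
    """A prediction matches the ground truth when tIoU reaches the threshold."""
    return temporal_iou(pred, gt) >= threshold

pred = (2.0, 8.0)   # hypothetical predicted segment
gt = (4.0, 10.0)    # hypothetical ground-truth segment
iou = temporal_iou(pred, gt)  # overlap 4.0, union 8.0, so 0.5
```

With these example segments the prediction counts as correct at a threshold of 0.5 but not at 0.75 or 0.95, which is why scores drop as the threshold tightens.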
