Temporal Action Localization on ActivityNet

Evaluation Metrics

mAP
mAP IOU@0.5
mAP IOU@0.75
mAP IOU@0.95

Evaluation Results

Performance of each model on this benchmark

| Model | mAP | mAP IOU@0.5 | mAP IOU@0.75 | mAP IOU@0.95 | Paper Title |
| --- | --- | --- | --- | --- | --- |
| RDFA-S6 (InternVideo2-6B) | 42.9 | 64.1 | 44.0 | 10.6 | Enhancing Temporal Action Localization: Advanced S6 Modeling with Recurrent Mechanism |
| ActionMamba (InternVideo2-6B) | 42.02 | 62.43 | 43.49 | 10.23 | Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding |
| PRN+BMN (ensemble) | 42.0 | 59.7 | - | - | Proposal Relation Network for Temporal Action Detection |
| AdaTAD (VideoMAEv2-giant) | 41.93 | 61.72 | 43.35 | 10.85 | End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames |
| InternVideo2-6B | 41.2 | - | - | - | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding |
| InternVideo2-1B | 40.4 | - | - | - | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding |
| UniMD+Sync. | 39.83 | 60.29 | - | - | UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection |
| PRN (CSN) | 39.4 | 57.9 | - | - | Proposal Relation Network for Temporal Action Detection |
| InternVideo | 39.00 | - | - | - | InternVideo: General Video Foundation Models via Generative and Discriminative Learning |
| TCANet (SlowFast R101) | 37.56 | 54.33 | 39.13 | 8.41 | Temporal Context Aggregation Network for Temporal Action Proposal Refinement |
| PRN (ViViT) | 37.5 | 55.5 | - | - | Proposal Relation Network for Temporal Action Detection |
| AVFusion | 36.82 | 54.34 | 37.66 | 8.93 | Hear Me Out: Fusional Approaches for Audio Augmented Temporal Action Localization |
| TriDet (TSP features) | 36.8 | 54.7 | 38.0 | 8.4 | TriDet: Temporal Action Detection with Relative Boundary Modeling |
| TadTR (TSP features) | 36.75 | 53.62 | 37.52 | 10.56 | End-to-end Temporal Action Detection with Transformer |
| ActionFormer (TSP features) | 36.6 | 54.7 | 37.8 | 8.4 | ActionFormer: Localizing Moments of Actions with Transformers |
| TAGS (I3D) | 36.5 | - | - | - | Proposal-Free Temporal Action Detection via Global Segmentation Mask Learning |
| VSGN (TSP features) | 35.94 | 53.26 | 36.76 | 8.12 | Video Self-Stitching Graph Network for Temporal Action Localization |
| TSP | 35.81 | 51.26 | 37.12 | 9.29 | TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks |
| HCN (I3D features) | 35.61 | 52.51 | 36.10 | 7.12 | Improve Temporal Action Proposals using Hierarchical Context- |
| DCAN (TSN features) | 35.39 | 51.78 | 35.98 | 9.45 | DCAN: Improving Temporal Action Detection via Dual Context Aggregation |