Moment Retrieval on QVHighlights

Evaluation Metrics

R@1 IoU=0.5
R@1 IoU=0.7
mAP
mAP@0.5
mAP@0.75
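
For readers unfamiliar with these metrics, the following is a minimal sketch of temporal IoU and R@1 at a given IoU threshold. It assumes each query has a single ground-truth span given as (start, end) in seconds and one top-ranked prediction; the function names `temporal_iou` and `recall_at_1` are illustrative and are not taken from any official evaluation script. mAP is not covered here.

```python
def temporal_iou(pred, gt):
    """Intersection over union of two (start, end) time spans."""
    # Overlap length is zero when the spans do not intersect.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(top1_preds, gts, iou_threshold):
    """Percentage of queries whose top-1 span reaches the IoU threshold.

    Assumption: one ground-truth span per query (a simplification).
    """
    hits = sum(
        temporal_iou(p, g) >= iou_threshold for p, g in zip(top1_preds, gts)
    )
    return 100.0 * hits / len(gts)


# Hypothetical predictions and ground truths for three queries.
preds = [(0, 10), (4, 10), (0, 4)]
gts = [(0, 10), (0, 10), (6, 10)]
r1_at_05 = recall_at_1(preds, gts, 0.5)  # two of three spans pass
r1_at_07 = recall_at_1(preds, gts, 0.7)  # only the exact match passes
```

A stricter threshold can only lower the score, which is why R@1 IoU=0.7 never exceeds R@1 IoU=0.5 for the same model.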

Results

Performance of each model on this benchmark

| Model | R@1 IoU=0.5 | R@1 IoU=0.7 | mAP | mAP@0.5 | mAP@0.75 | Paper Title |
|---|---|---|---|---|---|---|
| LLaVA-MR | 76.59 | 61.48 | 52.73 | 69.41 | 54.40 | LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval |
| SG-DETR (w/ PT) | 74.20 | 60.40 | 58.80 | 76.20 | 60.80 | Saliency-Guided DETR for Moment Retrieval and Highlight Detection |
| SG-DETR | 72.20 | 56.60 | 54.10 | 73.20 | 55.80 | Saliency-Guided DETR for Moment Retrieval and Highlight Detection |
| InternVideo2-6B | 71.42 | 56.45 | 49.24 | - | - | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding |
| FlashVTG | 70.69 | 53.96 | 52.00 | 72.33 | 53.85 | FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding |
| VideoLights-B-pt | 70.36 | 55.25 | 47.94 | 69.53 | 49.17 | VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval |
| CG-DETR (w/ PT) | 68.48 | 53.11 | 47.97 | 69.40 | 49.12 | Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding |
| R^2-Tuning | 68.03 | 49.35 | 46.17 | 69.04 | 47.56 | $R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding |
| LD-DETR | 66.80 | 51.04 | 46.41 | 67.61 | 46.99 | LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection |
| LLMEPET | 66.73 | 49.94 | 44.05 | 65.76 | 43.91 | Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval |
| video-mamba-suite | 66.65 | 52.19 | 45.18 | 64.37 | 46.68 | Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding |
| UnLoc-L | 66.1 | 46.7 | - | - | - | UnLoc: A Unified Framework for Video Localization Tasks |
| UniVTG (w/ PT) | 65.43 | 50.06 | 43.63 | 64.06 | 45.02 | UniVTG: Towards Unified Video-Language Temporal Grounding |
| CG-DETR | 65.43 | 48.38 | 42.86 | 64.51 | 42.77 | Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding |
| UVCOM (w/ PT ASR Captions) | 64.53 | 48.31 | 43.8 | 64.78 | 43.65 | Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection |
| UnLoc-B | 64.5 | 48.8 | - | - | - | UnLoc: A Unified Framework for Video Localization Tasks |
| QD-DETR (w/ PT) | 64.1 | 46.1 | 40.62 | 64.3 | 40.5 | Query-Dependent Video Representation for Moment Retrieval and Highlight Detection |
| BAM-DETR (w/ audio) | 64.07 | 48.12 | 46.91 | 65.61 | 47.51 | BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos |
| LA-DETR | 63.94 | 51.10 | 47.93 | 65.65 | 49.44 | Length-Aware DETR for Robust Moment Retrieval |
| BAM-DETR (w/ PT ASR Captions) | 63.88 | 47.92 | 46.67 | 66.33 | 48.22 | BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos |