Video Object Detection on ImageNet VID
Evaluation metric
mAP
Results
Performance of each model on this benchmark
| Model | mAP | Paper Title | Repository |
| YOLOV++ | 93.2 | Practical Video Object Detection via Feature Selection and Aggregation | |
| DiffusionVID (Swin-B) | 92.5 | DiffusionVID: Denoising Object Boxes with Spatio-temporal Conditioning for Video Object Detection | - |
| Ours (Def. DETR + SwinB) | 91.3 | Objects do not disappear: Video object detection by single-frame object location anticipation | |
| VSTAM | 91.1 | Video Sparse Transformer With Attention-Guided Memory for Video Object Detection | - |
| TGBFormer (Swin B) | 90.3 | TGBFormer: Transformer-GraphFormer Blender Network for Video Object Detection | - |
| TransVOD (Swin Base) | 90.1 | TransVOD: End-to-End Video Object Detection with Spatial-Temporal Transformers | |
| PTSEFormer (ResNet-101) | 88.1 | PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection | |
| Ours (Def. DETR + R101) | 87.9 | Objects do not disappear: Video object detection by single-frame object location anticipation | |
| YOLOV | 87.5 | YOLOV: Making Still Image Object Detectors Great at Video Object Detection | |
| Ours (Faster RCNN + R101) | 87.2 | Objects do not disappear: Video object detection by single-frame object location anticipation | |
| DiffusionVID (ResNet-101) | 87.1 | DiffusionVID: Denoising Object Boxes with Spatio-temporal Conditioning for Video Object Detection | - |
| DAFA-F (ResNeXt-101) | 85.9 | DAFA: Diversity-Aware Feature Aggregation for Attention-Based Video Object Detection | - |
| ClipVID | 85.8 | Identity-Consistent Aggregation for Video Object Detection | |
| HVRNet (ResNeXt101-32x4d) | 85.5 | Mining Inter-Video Proposal Relations for Video Object Detection | - |
| MEGA (ResNeXt101) | 85.4 | Memory Enhanced Global-Local Aggregation for Video Object Detection | |
| BoxMask (ResNeXt101) | 84.8 | BoxMask: Revisiting Bounding Box Supervision for Video Object Detection | - |
| DAFA-F (ResNet-101) | 84.5 | DAFA: Diversity-Aware Feature Aggregation for Attention-Based Video Object Detection | - |
| SELSA (ResNeXt-101) | 84.3 | Sequence Level Semantics Aggregation for Video Object Detection | |
| Temporal ROI Align (ResNeXt101) | 84.3 | Temporal RoI Align for Video Object Recognition | |
| REPP + SELSA (ResNet-101) | 84.2 | Robust and Efficient Post-Processing for Video Object Detection (REPP) | - |
Video Object Detection On Imagenet Vid | SOTA | HyperAI超神经