Video Instance Segmentation on OVIS 1

Evaluation Metrics

AP50
AP75
AR1
AR10
mask AP
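
The suffixes in AP50 and AP75 refer to the IoU threshold (0.50 and 0.75) a predicted instance must reach to count as a match. In video instance segmentation the IoU is usually computed over the whole mask track rather than per frame. The sketch below illustrates that idea under stated assumptions: masks are plain nested lists of 0/1 pixels, the helper names `video_mask_iou` and `counts_as_match` are hypothetical, and the `>=` comparison follows the common COCO-style convention rather than anything stated on this page.

```python
from typing import Sequence

# One frame is a grid of 0/1 pixels; a track is one frame mask per video frame.
Mask = Sequence[Sequence[int]]


def video_mask_iou(pred: Sequence[Mask], gt: Sequence[Mask]) -> float:
    """Spatio-temporal IoU: intersections and unions are summed over all frames."""
    if len(pred) != len(gt):
        raise ValueError("both tracks must cover the same number of frames")
    inter = 0
    union = 0
    for p_frame, g_frame in zip(pred, gt):
        for p_row, g_row in zip(p_frame, g_frame):
            for p, g in zip(p_row, g_row):
                inter += 1 if (p and g) else 0
                union += 1 if (p or g) else 0
    # Two empty tracks have no overlap to measure; treat that as IoU 0.
    return inter / union if union else 0.0


def counts_as_match(iou: float, threshold: float) -> bool:
    """Assumed convention: a prediction matches when IoU >= threshold."""
    return iou >= threshold


# Toy two-frame example (hypothetical data, 2x2 pixels per frame).
pred_track = [[[1, 1], [1, 0]], [[1, 1], [0, 0]]]
gt_track = [[[1, 1], [1, 1]], [[1, 1], [1, 0]]]

iou = video_mask_iou(pred_track, gt_track)  # 5 shared pixels / 7 in the union
match_at_50 = counts_as_match(iou, 0.50)
match_at_75 = counts_as_match(iou, 0.75)
```

In this toy case the track overlaps well enough to count toward AP50 but not toward AP75, which is why AP75 is always the stricter of the two columns.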

Evaluation Results

Performance of each model on this benchmark.

| Model | AP50 | AP75 | AR1 | AR10 | mask AP | Paper Title | Repository |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DVIS-DAQ(VIT-L, Offline) | 83.8 | 62.9 | - | - | 57.1 | DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries | |
| CAVIS(VIT-L, Offline) | 82.6 | 63.5 | 21.2 | 61.8 | 57.1 | Context-Aware Video Instance Segmentation | |
| DVIS++(VIT-L,Offline) | 78.9 | 58.5 | - | - | 53.4 | DVIS++: Improved Decoupled Framework for Universal Video Segmentation | |
| GLEE-Pro | - | 55.5 | - | - | 50.4 | General Object Foundation Model for Images and Videos at Scale | |
| DVIS(Swin-L, Offline) | 75.9 | 53.0 | 19.4 | 55.3 | 49.9 | DVIS: Decoupled Video Instance Segmentation Framework | |
| DVIS++(VIT-L, Online) | 72.5 | 55.0 | 20.8 | 54.6 | 49.6 | DVIS++: Improved Decoupled Framework for Universal Video Segmentation | |
| UNINEXT (ViT-H, Online) | 72.5 | 52.2 | - | - | 49.0 | Universal Instance Perception as Object Discovery and Retrieval | |
| DVIS(Swin-L, Online) | 71.9 | 49.2 | 19.4 | 52.5 | 47.1 | DVIS: Decoupled Video Instance Segmentation Framework | |
| CTVIS (Swin-L) | 71.5 | 47.5 | - | - | 46.9 | CTVIS: Consistent Training for Online Video Instance Segmentation | |
| RefineVIS (Swin-L, offline) | 70.4 | 48.4 | 19.1 | 51.2 | 46 | RefineVIS: Video Instance Segmentation with Temporal Attention Refinement | - |
| GRAtt-VIS (Swin-L) | 69.1 | 47.8 | 19.2 | 49.4 | 45.7 | GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation | |
| GenVIS (Swin-L) | 69.2 | 47.8 | 18.9 | 49.0 | 45.4 | A Generalized Framework for Video Instance Segmentation | |
| NOVIS (Swin-L) | 68.3 | 43.8 | 19.4 | 46.9 | 43.5 | NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation | - |
| TarViS (Swin-L) | 67.8 | 44.6 | 18.0 | 50.4 | 43.2 | TarViS: A Unified Approach for Target-based Video Segmentation | |
| ROVIS (Swin-L) | 64.7 | 42.6 | 18.4 | 49.1 | 42.6 | Robust Online Video Instance Segmentation with Track Queries | |
| MDQE(SwinL) | 67.8 | 44.3 | 18.3 | 46.5 | 42.6 | MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos | |
| IDOL (Swin-L) | 65.7 | 45.2 | 17.9 | 49.6 | 42.6 | In Defense of Online Models for Video Instance Segmentation | |
| UniVS(Swin-L) | - | - | - | - | 41.7 | UniVS: Unified and Universal Video Segmentation with Prompts as Queries | |
| DVIS++(R50, Offline) | 68.9 | 40.9 | 16.8 | 47.3 | 41.2 | DVIS++: Improved Decoupled Framework for Universal Video Segmentation | |
| BoxVIS(Swin-L & Box-sup) | 68.4 | 39.9 | - | - | 40.6 | BoxVIS: Video Instance Segmentation with Box Annotations | |
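
Several entries report "-" for some metrics, so re-ranking the results by anything other than mask AP needs explicit handling of missing values. The following is a minimal sketch of one way to do that, using four rows copied from the results above; the `Entry` type and the `parse` and `rank_by` helpers are hypothetical names introduced only for illustration.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    model: str
    ap50: Optional[float]
    ap75: Optional[float]
    mask_ap: float


def parse(value: str) -> Optional[float]:
    """Map the leaderboard's '-' placeholder to None, anything else to float."""
    return None if value.strip() == "-" else float(value)


# (model, AP50, AP75, mask AP) for four rows taken from the results above.
ROWS = [
    ("GLEE-Pro", "-", "55.5", "50.4"),
    ("DVIS-DAQ(VIT-L, Offline)", "83.8", "62.9", "57.1"),
    ("CAVIS(VIT-L, Offline)", "82.6", "63.5", "57.1"),
    ("UniVS(Swin-L)", "-", "-", "41.7"),
]

entries = [Entry(m, parse(a), parse(b), float(c)) for m, a, b, c in ROWS]


def rank_by(items: List[Entry], metric: str) -> List[Entry]:
    """Sort descending by one metric, skipping entries that do not report it."""
    reported = [e for e in items if getattr(e, metric) is not None]
    return sorted(reported, key=lambda e: getattr(e, metric), reverse=True)


by_ap75 = [e.model for e in rank_by(entries, "ap75")]
by_mask_ap = [e.model for e in rank_by(entries, "mask_ap")]
```

Skipping unreported values, rather than treating them as zero, avoids penalizing a model for a metric its authors did not publish. Note that the two top entries swap places depending on whether AP75 or mask AP is used as the sort key.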