Open-Vocabulary Object Detection on LVIS v1.0

Evaluation Metrics

AP novel-LVIS base training

Evaluation Results

Performance of each model on this benchmark

| Model | AP novel-LVIS base training | Paper Title |
| --- | --- | --- |
| LaMI-DETR | 43.4 | LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction |
| DITO | 40.4 | Region-centric Image-Language Pretraining for Open-Vocabulary Detection |
| OV-DQUO(ViT-L/14) | 39.3 | OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision |
| CoDet (EVA02-L) | 37.0 | CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection |
| CLIPSelf | 34.9 | CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction |
| OVMR | 34.4 | OVMR: Open-Vocabulary Recognition with Multi-Modal References |
| DE-ViT | 34.3 | Detect Everything with Few Examples |
| CFM-ViT | 33.9 | Contrastive Feature Masking Open-Vocabulary Vision Transformer |
| CLIM (RN50x64) | 32.3 | CLIM: Contrastive Language-Image Mosaic for Region Representation |
| RO-ViT | 32.1 | Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers |
| Prova (Swin-Base) | 31.5 | Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection |
| RTGen | 30.2 | RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection |
| OV-DQUO(ViT-B/16) | 29.7 | OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision |
| ViLD-ensemble w/ ALIGN (Eb7-FPN) | 26.3 | Open-vocabulary Object Detection via Vision and Language Knowledge Distillation |
| OWL-ViT (CLIP-L/14) | 25.6 | Simple Open-Vocabulary Object Detection with Vision Transformers |
| POMP | 25.2 | Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition |
| BARON | 22.6 | Aligning Bag of Regions for Open-Vocabulary Object Detection |
| MEDet | 22.4 | Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization |
| Region-CLIP (RN50x4-C4) | 22.0 | RegionCLIP: Region-based Language-Image Pretraining |
| RALF | 21.9 | Retrieval-Augmented Open-Vocabulary Object Detection |