CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection

Ma Chuofan ; Jiang Yi ; Wen Xin ; Yuan Zehuan ; Qi Xiaojuan

Abstract

Deriving reliable region-word alignment from image-text pairs is critical to learn object-level vision-language representations for open-vocabulary object detection. Existing methods typically rely on pre-trained or self-trained vision-language models for alignment, which are prone to limitations in localization accuracy or generalization capabilities. In this paper, we propose CoDet, a novel approach that overcomes the reliance on pre-aligned vision-language space by reformulating region-word alignment as a co-occurring object discovery problem. Intuitively, by grouping images that mention a shared concept in their captions, objects corresponding to the shared concept shall exhibit high co-occurrence among the group. CoDet then leverages visual similarities to discover the co-occurring objects and align them with the shared concept. Extensive experiments demonstrate that CoDet has superior performances and compelling scalability in open-vocabulary detection, e.g., by scaling up the visual backbone, CoDet achieves 37.0 $\text{AP}^m_{novel}$ and 44.7 $\text{AP}^m_{all}$ on OV-LVIS, surpassing the previous SoTA by 4.2 $\text{AP}^m_{novel}$ and 9.8 $\text{AP}^m_{all}$. Code is available at https://github.com/CVMI-Lab/CoDet.
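
The co-occurrence idea in the abstract can be illustrated with a small toy sketch. This is not the CoDet implementation: the function names (`group_by_concept`, `co_occurring_region`), the whitespace-based caption matching, the two-dimensional region features, and the "average of best cross-image cosine similarity" score are all assumptions chosen for illustration. The sketch groups images whose captions mention the same concept, then picks, for one image, the region that is most consistently similar to some region in every other image of the group.

```python
import math
from collections import defaultdict


def group_by_concept(captions, vocabulary):
    """Map each concept word to the ids of images whose caption mentions it."""
    groups = defaultdict(list)
    for image_id, caption in captions.items():
        words = set(caption.lower().split())
        for concept in vocabulary:
            if concept in words:
                groups[concept].append(image_id)
    return dict(groups)


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def co_occurring_region(image_id, group, regions):
    """Return the index of the region in `image_id` that best co-occurs
    across the group, or None if the group contains no other image.

    Score of a region = mean, over the other images, of its highest
    cosine similarity to any region of that image (an assumed heuristic).
    """
    others = [i for i in group if i != image_id]
    if not others:
        return None
    best_index, best_score = None, float("-inf")
    for index, feature in enumerate(regions[image_id]):
        score = sum(
            max(cosine(feature, other) for other in regions[o]) for o in others
        ) / len(others)
        if score > best_score:
            best_index, best_score = index, score
    return best_index


# Hypothetical toy data: captions and hand-made region features.
captions = {
    "img1": "a dog on the grass",
    "img2": "a dog chasing a frisbee",
    "img3": "a cat on a sofa",
}
vocabulary = ["dog", "cat", "frisbee"]
regions = {
    "img1": [[1.0, 0.0], [0.0, 1.0]],   # region 0 is dog-like, region 1 is background
    "img2": [[0.9, 0.1], [-1.0, 0.0]],  # region 0 is dog-like, region 1 is unrelated
}

groups = group_by_concept(captions, vocabulary)
dog_region_img1 = co_occurring_region("img1", groups["dog"], regions)
dog_region_img2 = co_occurring_region("img2", groups["dog"], regions)
```

In this toy setup the selected region in each image would then be paired with the shared word ("dog") as a region-word alignment. A real detector would replace the hand-made vectors with learned region features and would need a more robust matching of caption text to concepts.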

Code Repositories

cvmi-lab/codet (Official, PyTorch, mentioned in GitHub)

Benchmarks

Benchmark: open-vocabulary-object-detection-on-lvis-v1-0
Methodology: CoDet (EVA02-L)
Metrics: AP novel-LVIS base training: 37.0
