5 months ago

Described Object Detection: Liberating Object Detection with Flexible Expressions

Xie Chi ; Zhang Zhao ; Wu Yixuan ; Zhu Feng ; Zhao Rui ; Liang Shuang


Abstract

Detecting objects based on language information is a popular task that includes Open-Vocabulary object Detection (OVD) and Referring Expression Comprehension (REC). In this paper, we advance them to a more practical setting called Described Object Detection (DOD) by expanding category names to flexible language expressions for OVD and overcoming the limitation of REC only grounding the pre-existing object. We establish the research foundation for DOD by constructing a Description Detection Dataset ($D^3$). This dataset features flexible language expressions, whether short category names or long descriptions, and annotating all described objects on all images without omission. By evaluating previous SOTA methods on $D^3$, we find some troublemakers that fail current REC, OVD, and bi-functional methods. REC methods struggle with confidence scores, rejecting negative instances, and multi-target scenarios, while OVD methods face constraints with long and complex descriptions. Recent bi-functional methods also do not work well on DOD due to their separated training procedures and inference strategies for REC and OVD tasks. Building upon the aforementioned findings, we propose a baseline that largely improves REC methods by reconstructing the training data and introducing a binary classification sub-task, outperforming existing methods. Data and code are available at https://github.com/shikras/d-cube and related works are tracked in https://github.com/Charles-Xie/awesome-described-object-detection.
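
The contrast the abstract draws between REC and DOD outputs can be illustrated with a minimal sketch. It is not the paper's baseline or the d-cube API: the `Candidate` type, the two function names, and the 0.5 threshold are all hypothetical, chosen only to show why an output rule that always returns one box cannot reject a description that matches nothing, and cannot return several matching objects.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    # Hypothetical box in (x1, y1, x2, y2) pixel coordinates.
    box: Tuple[int, int, int, int]
    # Hypothetical confidence that the box matches the description.
    score: float


def rec_style(candidates: List[Candidate]) -> List[Candidate]:
    """REC-style rule: return the single top-scoring box, however low its score."""
    if not candidates:
        return []
    return [max(candidates, key=lambda c: c.score)]


def dod_style(candidates: List[Candidate], threshold: float = 0.5) -> List[Candidate]:
    """DOD-style rule: return every box at or above the threshold, possibly none.

    The threshold value is an assumption made for this illustration.
    """
    return [c for c in candidates if c.score >= threshold]


# Description that matches nothing in the image: all scores are low.
absent = [Candidate((0, 0, 10, 10), 0.2), Candidate((5, 5, 20, 20), 0.1)]

# Description that matches two objects in the image.
multi = [
    Candidate((0, 0, 10, 10), 0.9),
    Candidate((30, 30, 50, 50), 0.8),
    Candidate((60, 60, 70, 70), 0.1),
]

print(len(rec_style(absent)), len(dod_style(absent)))
print(len(rec_style(multi)), len(dod_style(multi)))
```

Under these assumptions the top-1 rule still emits a box for the absent description and drops the second match in the multi-target case, which mirrors the failure modes the abstract attributes to REC methods.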

Code Repositories

shikras/d-cube
Official
Mentioned in GitHub

Benchmarks

Benchmark | Methodology | Metrics
described-object-detection-on-description | OFA-DOD-base | Intra-scenario ABS mAP: 15.4; Intra-scenario FULL mAP: 21.6; Intra-scenario PRES mAP: 23.7
