Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

Abstract
This paper introduces Grounding DINO 1.5, a suite of advanced open-set object detection models developed by IDEA Research, which aims to advance the "Edge" of open-set object detection. The suite encompasses two models: Grounding DINO 1.5 Pro, a high-performance model designed for stronger generalization capability across a wide range of scenarios, and Grounding DINO 1.5 Edge, an efficient model optimized for faster speed demanded in many applications requiring edge deployment. The Grounding DINO 1.5 Pro model advances its predecessor by scaling up the model architecture, integrating an enhanced vision backbone, and expanding the training dataset to over 20 million images with grounding annotations, thereby achieving a richer semantic understanding. The Grounding DINO 1.5 Edge model, while designed for efficiency with reduced feature scales, maintains robust detection capabilities by being trained on the same comprehensive dataset. Empirical results demonstrate the effectiveness of Grounding DINO 1.5, with the Grounding DINO 1.5 Pro model attaining a 54.3 AP on the COCO detection benchmark and a 55.7 AP on the LVIS-minival zero-shot transfer benchmark, setting new records for open-set object detection. Furthermore, the Grounding DINO 1.5 Edge model, when optimized with TensorRT, achieves a speed of 75.2 FPS while attaining a zero-shot performance of 36.2 AP on the LVIS-minival benchmark, making it more suitable for edge computing scenarios. Model examples and demos with API will be released at https://github.com/IDEA-Research/Grounding-DINO-1.5-API
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| few-shot-object-detection-on-odinw-13 | Grounding DINO 1.5 Pro | Average Score: 66.3 |
| few-shot-object-detection-on-odinw-35 | Grounding DINO 1.5 Pro | Average Score: 54.7 |
| object-detection-on-lvis-v1-0-minival | Grounding DINO 1.5 Pro | box AP: 68.1 |
| object-detection-on-lvis-v1-0-val | Grounding DINO 1.5 Pro | box AP: 63.5; box APr: 64.0 |
| object-detection-on-odinw-full-shot-13-tasks | Grounding DINO 1.5 Pro | AP: 72.4 |
| object-detection-on-odinw-full-shot-35-tasks | Grounding DINO 1.5 Pro | AP: 72.4 |
| zero-shot-object-detection-on-lvis-v1-0 | Grounding DINO 1.6 Pro (without LVIS data) | AP: 57.7 |
| zero-shot-object-detection-on-lvis-v1-0 | Grounding DINO 1.5 Pro (without LVIS data) | AP: 55.7 |
| zero-shot-object-detection-on-lvis-v1-0-val | Grounding DINO 1.6 Pro (without LVIS data) | AP: 51.1 |
| zero-shot-object-detection-on-lvis-v1-0-val | Grounding DINO 1.5 Pro (without LVIS data) | AP: 47.7 |
| zero-shot-object-detection-on-mscoco | Grounding DINO 1.5 Pro (without COCO data) | AP: 54.3 |
| zero-shot-object-detection-on-mscoco | Grounding DINO 1.6 Pro (without COCO data) | AP: 55.4 |
| zero-shot-object-detection-on-odinw | Grounding DINO 1.5 Pro | Average Score: 30.2 |
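
The zero-shot rows above list both Grounding DINO 1.5 Pro and Grounding DINO 1.6 Pro, so the per-benchmark difference can be read off with simple arithmetic. The following is a minimal sketch of that comparison; the AP values are copied from the table, while the dictionary layout and the `ap_gain` helper are illustrative assumptions and not part of any released API.

```python
# Zero-shot AP values copied from the benchmark table above.
# The dictionary structure and key names are assumptions for illustration.
ZERO_SHOT_AP = {
    "zero-shot-object-detection-on-lvis-v1-0": {"1.5 Pro": 55.7, "1.6 Pro": 57.7},
    "zero-shot-object-detection-on-lvis-v1-0-val": {"1.5 Pro": 47.7, "1.6 Pro": 51.1},
    "zero-shot-object-detection-on-mscoco": {"1.5 Pro": 54.3, "1.6 Pro": 55.4},
}


def ap_gain(scores, old="1.5 Pro", new="1.6 Pro"):
    """Return the AP difference (new minus old), rounded to one decimal."""
    return round(scores[new] - scores[old], 1)


# Hypothetical helper usage: gain of 1.6 Pro over 1.5 Pro on each benchmark.
gains = {name: ap_gain(scores) for name, scores in ZERO_SHOT_AP.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} AP")
```

Under these table values, the listed gain of 1.6 Pro over 1.5 Pro is largest on the LVIS val benchmark and smallest on COCO.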