5 months ago

D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement

Yansong Peng; Hebei Li; Peixi Wu; Yueyi Zhang; Xiaoyan Sun; Feng Wu

Abstract

We introduce D-FINE, a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding box regression task in DETR models. D-FINE comprises two key components: Fine-grained Distribution Refinement (FDR) and Global Optimal Localization Self-Distillation (GO-LSD). FDR transforms the regression process from predicting fixed coordinates to iteratively refining probability distributions, providing a fine-grained intermediate representation that significantly enhances localization accuracy. GO-LSD is a bidirectional optimization strategy that transfers localization knowledge from refined distributions to shallower layers through self-distillation, while also simplifying the residual prediction tasks for deeper layers. Additionally, D-FINE incorporates lightweight optimizations in computationally intensive modules and operations, achieving a better balance between speed and accuracy. Specifically, D-FINE-L / X achieves 54.0% / 55.8% AP on the COCO dataset at 124 / 78 FPS on an NVIDIA T4 GPU. When pretrained on Objects365, D-FINE-L / X attains 57.1% / 59.3% AP, surpassing all existing real-time detectors. Furthermore, our method significantly enhances the performance of a wide range of DETR models by up to 5.3% AP with negligible extra parameters and training costs. Our code and pretrained models: https://github.com/Peterande/D-FINE.
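The abstract describes FDR as refining a probability distribution layer by layer and GO-LSD as distilling the refined distribution back into shallower layers. The following is a minimal sketch of that idea for a single box edge, not the authors' implementation. The evenly spaced bins, the additive residual logits, the KL-divergence loss, and every name and number in it (`bin_values`, `refine`, `expected_offset`, `kl_divergence`, the example logits) are illustrative assumptions that do not come from the abstract.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_offset(logits, bin_values):
    """Decode a distribution over offset bins into one scalar offset."""
    probs = softmax(logits)
    return sum(p * v for p, v in zip(probs, bin_values))

def refine(initial_logits, residuals_per_layer):
    """Assumed refinement rule: each layer adds residual logits to the
    previous layer's logits. Returns the logits after every layer."""
    history = []
    logits = list(initial_logits)
    for residual in residuals_per_layer:
        logits = [l + r for l, r in zip(logits, residual)]
        history.append(list(logits))
    return history

def kl_divergence(teacher_logits, student_logits):
    """KL(teacher || student), used here as a stand-in distillation loss."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical setup: 5 evenly spaced offset bins for one box edge.
bin_values = [-1.0, -0.5, 0.0, 0.5, 1.0]
initial = [0.0] * 5                      # uniform starting distribution
residuals = [
    [0.0, 0.0, 0.5, 1.0, 0.5],           # made-up output of layer 1
    [0.0, 0.0, 0.0, 1.0, 0.0],           # made-up output of layer 2
]

history = refine(initial, residuals)
offsets = [expected_offset(l, bin_values) for l in history]

# Self-distillation sketch: the last layer acts as teacher for earlier layers.
distill_losses = [kl_divergence(history[-1], l) for l in history[:-1]]

print("decoded offsets per layer:", offsets)
print("distillation loss per shallow layer:", distill_losses)
```

In this toy setup each layer only has to predict a small correction to the previous distribution, and the distillation term pulls the shallow layer's distribution toward the final one. An actual detector would apply this to all four box edges and learn the residual logits.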

Code Repositories

Peterande/D-FINE (Official, pytorch, Mentioned in GitHub)
shihuahuang95/deim (pytorch, Mentioned in GitHub)
open-edge-platform/geti (pytorch, Mentioned in GitHub)

Benchmarks

Benchmark                            Methodology   FPS (V100, b=1)   box AP
real-time-object-detection-on-coco   D-FINE-S+     287 (T4)          50.7
real-time-object-detection-on-coco   D-FINE-M      178 (T4)          52.3
real-time-object-detection-on-coco   D-FINE-L      124 (T4)          54.0
real-time-object-detection-on-coco   D-FINE-S      287 (T4)          48.5
real-time-object-detection-on-coco   D-FINE-M+     178 (T4)          55.1
real-time-object-detection-on-coco   D-FINE-X      78 (T4)           55.8
real-time-object-detection-on-coco   D-FINE-X+     78 (T4)           59.3
