Relation DETR: Exploring Explicit Position Relation Prior for Object Detection
Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan

Abstract
This paper presents a general scheme for enhancing the convergence and performance of DETR (DEtection TRansformer). We investigate the slow convergence problem in transformers from a new perspective, suggesting that it arises from self-attention, which introduces no structural bias over inputs. To address this issue, we explore incorporating a position relation prior as an attention bias to augment object detection, after verifying its statistical significance with a proposed quantitative macroscopic correlation (MC) metric. Our approach, termed Relation-DETR, introduces an encoder that constructs position relation embeddings for progressive attention refinement, which further extends the traditional streaming pipeline of DETR into a contrastive relation pipeline to resolve the conflict between non-duplicate predictions and positive supervision. Extensive experiments on both generic and task-specific datasets demonstrate the effectiveness of our approach. Under the same configurations, Relation-DETR achieves a significant improvement (+2.0% AP over DINO), state-of-the-art performance (51.7% AP for 1x and 52.1% AP for 2x settings), and remarkably faster convergence (over 40% AP with only 2 training epochs) compared with existing DETR detectors on COCO val2017. Moreover, the proposed relation encoder serves as a universal plug-and-play component, bringing clear improvements to theoretically any DETR-like method. Furthermore, we introduce a class-agnostic detection dataset, SA-Det-100k. Experimental results on this dataset show that the proposed explicit position relation yields a clear improvement of 1.3% AP, highlighting its potential towards universal object detection. The code and dataset are available at https://github.com/xiuqhou/Relation-DETR.
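To make the core idea concrete, the sketch below shows one way a pairwise position-relation term can enter attention as an additive bias on the logits. This is an illustrative approximation, not the paper's implementation: the geometry features loosely follow relation-network-style translation- and scale-invariant box features, and the random projection stands in for the learned relation embedding; `position_relation_bias` and `attention_with_bias` are hypothetical names.

```python
import numpy as np

def position_relation_bias(boxes, num_heads=8, rng=None):
    """Sketch of a pairwise position-relation embedding used as attention bias.

    boxes: (N, 4) array of (cx, cy, w, h). Returns a (num_heads, N, N) bias.
    The projection weights are random stand-ins for a learned MLP.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    cx, cy, w, h = boxes.T
    eps = 1e-6
    # Translation- and scale-invariant pairwise geometry features (4 per box pair).
    dx = np.log(np.abs(cx[:, None] - cx[None, :]) / (w[:, None] + eps) + eps)
    dy = np.log(np.abs(cy[:, None] - cy[None, :]) / (h[:, None] + eps) + eps)
    dw = np.log(w[:, None] / w[None, :] + eps)
    dh = np.log(h[:, None] / h[None, :] + eps)
    feats = np.stack([dx, dy, dw, dh], axis=-1)       # (N, N, 4)
    proj = rng.normal(size=(4, num_heads))            # stand-in for learned weights
    bias = feats @ proj                               # (N, N, num_heads)
    return np.transpose(bias, (2, 0, 1))              # (num_heads, N, N)

def attention_with_bias(q, k, v, bias):
    """Scaled dot-product attention with an additive position-relation bias."""
    logits = q @ k.T / np.sqrt(q.shape[-1]) + bias    # bias shifts attention logits
    weights = np.exp(logits - logits.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return weights @ v
```

Because the bias is added to the logits before the softmax, it steers which query-key pairs attend to each other based on box geometry alone, which is the structural prior the paper argues plain self-attention lacks.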
Code Repositories
https://github.com/xiuqhou/Relation-DETR
Benchmarks
| Benchmark | Model | AP | AP50 | AP75 | APS | APM | APL | Params (M) |
|---|---|---|---|---|---|---|---|---|
| object-detection-on-coco | Relation-DETR (Focal-L) | 63.5 | 80.8 | 69.1 | 47.2 | 66.9 | 77.0 | 214 |
| object-detection-on-coco-2017-val | Relation-DETR (ResNet50 2x) | 52.1 | 69.7 | 56.6 | 36.1 | 56.0 | 66.5 | — |
| object-detection-on-coco-2017-val | Relation-DETR (Swin-L 2x) | 58.1 | 76.4 | 63.5 | 41.8 | 63.0 | 73.5 | — |
| object-detection-on-coco-2017-val | Relation-DETR (ResNet50 1x) | 51.7 | 69.1 | 56.3 | 36.1 | 55.6 | 66.1 | — |
| object-detection-on-coco-2017-val | Relation-DETR (Swin-L 1x) | 57.8 | 76.1 | 62.9 | 41.2 | 62.1 | 74.4 | — |
| object-detection-on-sa-det-100k | Relation-DETR (ResNet50 1x) | 45.0 | 53.1 | 48.9 | 6.0 | 44.4 | 62.9 | — |