Relation DETR: Exploring Explicit Position Relation Prior for Object Detection
Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan

Abstract
This paper presents a general scheme for enhancing the convergence and performance of DETR (DEtection TRansformer). We investigate the slow convergence problem in transformers from a new perspective, suggesting that it arises from self-attention, which introduces no structural bias over inputs. To address this issue, we explore incorporating a position relation prior as an attention bias to augment object detection, after verifying its statistical significance with a proposed quantitative macroscopic correlation (MC) metric. Our approach, termed Relation-DETR, introduces an encoder that constructs position relation embeddings for progressive attention refinement, which further extends the traditional streaming pipeline of DETR into a contrastive relation pipeline to resolve the conflict between non-duplicate predictions and positive supervision. Extensive experiments on both generic and task-specific datasets demonstrate the effectiveness of our approach. Under the same configurations, Relation-DETR achieves a significant improvement (+2.0% AP over DINO), state-of-the-art performance (51.7% AP for 1x and 52.1% AP for 2x settings), and remarkably faster convergence (over 40% AP with only 2 training epochs) compared with existing DETR detectors on COCO val2017. Moreover, the proposed relation encoder serves as a universal plug-and-play component, bringing clear improvements to theoretically any DETR-like method. Furthermore, we introduce a class-agnostic detection dataset, SA-Det-100k. Experimental results on this dataset show that the proposed explicit position relation yields a clear improvement of 1.3% AP, highlighting its potential towards universal object detection. The code and dataset are available at https://github.com/xiuqhou/Relation-DETR.
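To make the core idea concrete, the sketch below shows one way a pairwise position-relation term can enter attention as an additive bias on the logits. This is an illustrative approximation, not the paper's implementation: the geometry features loosely follow relation-network-style translation- and scale-invariant box features, and the random projection stands in for the learned relation embedding; `position_relation_bias` and `attention_with_bias` are hypothetical names.

```python
import numpy as np

def position_relation_bias(boxes, num_heads=8, rng=None):
    """Sketch of a pairwise position-relation embedding used as attention bias.

    boxes: (N, 4) array of (cx, cy, w, h). Returns a (num_heads, N, N) bias.
    The projection weights are random stand-ins for a learned MLP.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    cx, cy, w, h = boxes.T
    eps = 1e-6
    # Translation- and scale-invariant pairwise geometry features (4 per box pair).
    dx = np.log(np.abs(cx[:, None] - cx[None, :]) / (w[:, None] + eps) + eps)
    dy = np.log(np.abs(cy[:, None] - cy[None, :]) / (h[:, None] + eps) + eps)
    dw = np.log(w[:, None] / w[None, :] + eps)
    dh = np.log(h[:, None] / h[None, :] + eps)
    feats = np.stack([dx, dy, dw, dh], axis=-1)       # (N, N, 4)
    proj = rng.normal(size=(4, num_heads))            # stand-in for learned weights
    bias = feats @ proj                               # (N, N, num_heads)
    return np.transpose(bias, (2, 0, 1))              # (num_heads, N, N)

def attention_with_bias(q, k, v, bias):
    """Scaled dot-product attention with an additive position-relation bias."""
    logits = q @ k.T / np.sqrt(q.shape[-1]) + bias    # bias shifts attention logits
    weights = np.exp(logits - logits.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return weights @ v
```

Because the bias is added to the logits before the softmax, it steers which query-key pairs attend to each other based on box geometry alone, which is the structural prior the paper argues plain self-attention lacks.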
Code Repositories
https://github.com/xiuqhou/Relation-DETR
Benchmarks
| Benchmark | Model | AP | AP50 | AP75 | APS | APM | APL | Params (M) |
|---|---|---|---|---|---|---|---|---|
| object-detection-on-coco | Relation-DETR (Focal-L) | 63.5 | 80.8 | 69.1 | 47.2 | 66.9 | 77.0 | 214 |
| object-detection-on-coco-2017-val | Relation-DETR (ResNet50 2x) | 52.1 | 69.7 | 56.6 | 36.1 | 56.0 | 66.5 | — |
| object-detection-on-coco-2017-val | Relation-DETR (Swin-L 2x) | 58.1 | 76.4 | 63.5 | 41.8 | 63.0 | 73.5 | — |
| object-detection-on-coco-2017-val | Relation-DETR (ResNet50 1x) | 51.7 | 69.1 | 56.3 | 36.1 | 55.6 | 66.1 | — |
| object-detection-on-coco-2017-val | Relation-DETR (Swin-L 1x) | 57.8 | 76.1 | 62.9 | 41.2 | 62.1 | 74.4 | — |
| object-detection-on-sa-det-100k | Relation-DETR (ResNet50 1x) | 45.0 | 53.1 | 48.9 | 6.0 | 44.4 | 62.9 | — |