
Abstract
The complex coupling between localization and classification in object detection has driven a flourishing line of research. While prior work has sought to improve the performance of various object detection heads, it has failed to present a unified view. In this paper, we propose a novel dynamic head framework that unifies object detection heads with attention. By coherently combining self-attention across feature levels for scale-awareness, across spatial positions for spatial-awareness, and within output channels for task-awareness, the proposed approach significantly improves the representation ability of detection heads without introducing any computational overhead. Extensive experiments demonstrate the effectiveness and efficiency of the proposed dynamic head on the COCO benchmark. With a standard ResNeXt-101-DCN backbone, our method achieves large gains over popular object detectors and sets a new state of the art at 54.0 AP. Further combined with the latest Transformer backbone and extra training data, the current best COCO result can be pushed to a new record of 60.6 AP. Code will be released at https://github.com/microsoft/DynamicHead.
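The three attentions described above act sequentially along the three axes of the head's feature tensor (levels × positions × channels). Below is a minimal NumPy sketch of that idea; the function name `dyhead_block` and the simple sigmoid gates are illustrative assumptions — the actual paper uses a hard-sigmoid with learned offsets for scale attention, deformable convolution for spatial attention, and a dynamic-ReLU-style activation for task attention.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dyhead_block(feat):
    """Simplified sketch of one dynamic-head block on a tensor of shape
    (L, S, C): L = pyramid levels, S = H*W spatial positions, C = channels.
    Each stage gates the tensor with attention along one axis."""
    # Scale-aware attention: one gate per level from globally pooled features.
    pi_L = sigmoid(feat.mean(axis=(1, 2), keepdims=True))   # (L, 1, 1)
    feat = pi_L * feat
    # Spatial-aware attention (the paper uses deformable convolution;
    # a plain per-position gate stands in here as a toy substitute).
    pi_S = sigmoid(feat.mean(axis=(0, 2), keepdims=True))   # (1, S, 1)
    feat = pi_S * feat
    # Task-aware attention (the paper uses a dynamic-ReLU-style activation;
    # a per-channel gate stands in here).
    pi_C = sigmoid(feat.mean(axis=(0, 1), keepdims=True))   # (1, 1, C)
    return pi_C * feat

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16, 8))   # 4 levels, 16 positions, 8 channels
y = dyhead_block(x)
print(y.shape)  # (4, 16, 8)
```

Because each attention only reweights one axis, the block's output keeps the input shape, and blocks can be stacked to deepen the head.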
Code Repositories
microsoft/DynamicHead
Official
pytorch
Mentioned in GitHub
open-mmlab/mmdetection
pytorch
Coldestadam/DynamicHead
pytorch
Mentioned in GitHub
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| object-detection-on-coco | DyHead (ResNet-101) | AP50: 64.5 AP75: 50.7 |
| object-detection-on-coco | DyHead (Swin-L, multi scale, self-training) | AP50: 78.5 AP75: 66.6 APL: 74.2 APM: 64.0 box mAP: 60.6 |
| object-detection-on-coco | DyHead (ResNet-50) | AP50: 60.7 AP75: 46.8 box mAP: 43.0 |
| object-detection-on-coco | DyHead (ResNeXt-64x4d-101) | AP50: 65.7 AP75: 51.9 box mAP: 47.7 |
| object-detection-on-coco | DyHead (Swin-L, multi scale) | AP50: 77.1 AP75: 64.5 APL: 72.8 APM: 62.0 box mAP: 58.7 |
| object-detection-on-coco | DyHead (ResNeXt-64x4d-101-DCN, multi scale) | AP50: 72.1 AP75: 59.3 box mAP: 54.0 |
| object-detection-on-coco-2017-val | DyHead (Swin-T, multi scale) | AP50: 68.0 AP75: 54.3 APL: 64.2 |
| object-detection-on-coco-minival | DyHead (Swin-L, multi scale, self-training) | AP50: 78.2 APL: 74.2 box AP: 60.3 |
| object-detection-on-coco-minival | DyHead (ResNet-101) | box AP: 46.5 |
| object-detection-on-coco-minival | DyHead (Swin-L, multi scale) | AP50: 76.8 APL: 73.2 APM: 62.2 APS: 44.5 box AP: 58.4 |
| object-detection-on-coco-minival | DyHead (ResNeXt-64x4d-101-DCN, multi scale) | APL: 66.3 |
| object-detection-on-coco-o | DyHead (ResNet-50) | Average mAP: 19.3 Effective Robustness: 0.16 |
| object-detection-on-coco-o | DyHead (Swin-L) | Average mAP: 35.3 Effective Robustness: 10.00 |