Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection
Li Xiang ; Wang Wenhai ; Hu Xiaolin ; Li Jun ; Tang Jinhui ; Yang Jian

Abstract
Localization Quality Estimation (LQE) is crucial and popular in the recent advancement of dense object detectors since it can provide accurate ranking scores that benefit the Non-Maximum Suppression processing and improve detection performance. As a common practice, most existing methods predict LQE scores through vanilla convolutional features shared with object classification or bounding box regression. In this paper, we explore a completely novel and different perspective to perform LQE -- based on the learned distributions of the four parameters of the bounding box. The bounding box distributions are inspired and introduced as "General Distribution" in GFLV1, which describes the uncertainty of the predicted bounding boxes well. Such a property makes the distribution statistics of a bounding box highly correlated to its real localization quality. Specifically, a bounding box distribution with a sharp peak usually corresponds to high localization quality, and vice versa. By leveraging the close correlation between distribution statistics and the real localization quality, we develop a considerably lightweight Distribution-Guided Quality Predictor (DGQP) for reliable LQE based on GFLV1, thus producing GFLV2. To our best knowledge, it is the first attempt in object detection to use a highly relevant, statistical representation to facilitate LQE. Extensive experiments demonstrate the effectiveness of our method. Notably, GFLV2 (ResNet-101) achieves 46.2 AP at 14.6 FPS, surpassing the previous state-of-the-art ATSS baseline (43.6 AP at 14.6 FPS) by absolute 2.6 AP on COCO test-dev, without sacrificing the efficiency both in training and inference. Code will be available at https://github.com/implus/GFocalV2.
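
The link between a sharp distribution peak and localization quality can be illustrated with a small NumPy sketch. It is not the authors' implementation. The abstract does not say which statistics DGQP uses, so the choice here (the top-k probabilities of each side's discretized distribution plus the mean of those values) is an assumption, as are the function names, the bin count, and `k=4`. The sketch only computes the statistics; the lightweight predictor that would consume them is omitted.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def distribution_statistics(side_logits, k=4):
    """Summarize the four per-side distributions of one bounding box.

    side_logits: array of shape (4, n_bins), one row of logits per box side.
    Returns a flat vector of length 4 * (k + 1): for each side, the k largest
    probabilities followed by their mean (an assumed choice of statistics).
    """
    probs = softmax(side_logits, axis=-1)
    topk = -np.sort(-probs, axis=-1)[:, :k]       # k largest values, descending
    mean = topk.mean(axis=-1, keepdims=True)      # mean of those top-k values
    return np.concatenate([topk, mean], axis=-1).reshape(-1)


n_bins = 8  # hypothetical number of discretization bins

# A sharply peaked distribution on every side (confident localization).
sharp_logits = np.zeros((4, n_bins))
sharp_logits[:, 3] = 10.0

# A completely flat distribution on every side (uncertain localization).
flat_logits = np.zeros((4, n_bins))

sharp_feat = distribution_statistics(sharp_logits)
flat_feat = distribution_statistics(flat_logits)

# The largest probability is close to 1 for the sharp box and 1 / n_bins
# for the flat box, so the statistics separate the two cases.
print(sharp_feat[0], flat_feat[0])
```

In a detector, a vector like `sharp_feat` would be passed to a small learned predictor to produce the quality score; the features alone already distinguish peaked from flat distributions.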
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| object-detection-on-coco | GFLV2 (ResNeXt-101, 32x4d, DCN) | box mAP: 49; AP50: 67.6; AP75: 53.5; APS: 29.7; APM: 52.4; APL: 61.4; Hardware Burden: 3G |
| object-detection-on-coco | GFLV2 (ResNet-101-DCN) | box mAP: 48.3; AP50: 66.5; AP75: 52.8; APS: 28.8; APM: 51.9; APL: 60.7; Hardware Burden: 3G |
| object-detection-on-coco | GFLV2 (Res2Net-101, DCN) | box mAP: 50.6; AP50: 69; AP75: 55.3; APS: 31.3; APM: 54.3; APL: 63.5 |
| object-detection-on-coco | GFLV2 (ResNet-101) | box mAP: 46.2; AP50: 64.3; AP75: 50.5; APS: 27.8; APM: 49.9; APL: 57 |
| object-detection-on-coco | GFLV2 (ResNet-50) | box mAP: 44.3; AP50: 62.3; AP75: 48.5; APS: 26.8; APM: 47.7; APL: 54.1 |
| object-detection-on-coco | GFLV2 (Res2Net-101, DCN, multiscale) | box mAP: 53.3; AP50: 70.9; AP75: 59.2; APS: 35.7; APM: 56.1; APL: 65.6 |
| object-detection-on-coco-o | GFLv2 (R2-101-DCN) | Average mAP: 25.1; Effective Robustness: 2.6 |