Xin Lu; Buyu Li; Yuxin Yue; Quanquan Li; Junjie Yan

Abstract
This paper proposes a novel object detection framework named Grid R-CNN, which adopts a grid-guided localization mechanism for accurate object detection. Unlike traditional regression-based methods, Grid R-CNN captures spatial information explicitly and benefits from the position-sensitive property of fully convolutional architectures. Instead of using only two independent points, we design a multi-point supervision formulation that encodes more clues and so reduces the impact of inaccurate predictions at specific points. To take full advantage of the correlation between points in a grid, we propose a two-stage information fusion strategy that fuses the feature maps of neighboring grid points. The grid-guided localization approach is easily extended to different state-of-the-art detection frameworks. Grid R-CNN leads to high-quality object localization, and experiments demonstrate that it achieves a 4.1% AP gain at IoU=0.8 and a 10.0% AP gain at IoU=0.9 on the COCO benchmark compared to Faster R-CNN with a Res50 backbone and FPN architecture.
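To make the idea of grid-guided localization concrete, the following is a minimal sketch of how a set of per-point heatmaps could be decoded into a bounding box. It is not the authors' implementation: the abstract does not specify the grid size, the heatmap resolution, or the decoding rule, so the 3x3 grid, the row-major point order, the argmax decoding, and the confidence-weighted averaging along each box edge are all assumptions made for illustration. The function name `grid_points_to_box` is hypothetical.

```python
import numpy as np


def grid_points_to_box(heatmaps, roi, grid_size=3):
    """Decode per-grid-point heatmaps into a box (illustrative sketch only).

    heatmaps: array of shape (grid_size * grid_size, H, W), one heatmap per
        grid point, in row-major grid order (top-left point first).
        This layout is an assumption, not taken from the paper.
    roi: (x1, y1, x2, y2) region of interest in image coordinates.
    Returns (left, top, right, bottom) in image coordinates.
    """
    n_points, h, w = heatmaps.shape
    assert n_points == grid_size * grid_size
    x1, y1, x2, y2 = roi

    # Locate the peak of each heatmap and keep its score as a confidence.
    flat = heatmaps.reshape(n_points, -1)
    idx = flat.argmax(axis=1)
    conf = flat[np.arange(n_points), idx]
    py, px = np.divmod(idx, w)  # row and column of each peak

    # Map heatmap pixel centers back into the region of interest.
    xs = (x1 + (px + 0.5) / w * (x2 - x1)).reshape(grid_size, grid_size)
    ys = (y1 + (py + 0.5) / h * (y2 - y1)).reshape(grid_size, grid_size)
    conf = conf.reshape(grid_size, grid_size)

    def weighted_mean(values, weights):
        # Assumes at least one point on the edge has nonzero confidence.
        return float((values * weights).sum() / weights.sum())

    # Each box edge is estimated from all grid points lying on that edge,
    # so a single badly predicted point has limited influence.
    left = weighted_mean(xs[:, 0], conf[:, 0])
    right = weighted_mean(xs[:, -1], conf[:, -1])
    top = weighted_mean(ys[0, :], conf[0, :])
    bottom = weighted_mean(ys[-1, :], conf[-1, :])
    return left, top, right, bottom
```

Averaging several points per edge is one simple way to reflect the multi-point supervision described in the abstract: compared with decoding only two corner points, an outlier at one grid point is diluted by the other points on the same edge.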
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 2d-object-detection-on-sardet-100k | Grid R-CNN | box mAP: 48.8 |
| object-detection-on-coco | Grid R-CNN (ResNeXt-101-FPN) | AP50: 63.0 AP75: 46.6 APL: 55.2 APM: 46.5 APS: 25.1 box mAP: 43.2 |
| object-detection-on-coco-minival | Grid R-CNN (ResNet-50-FPN) | AP50: 58.3 AP75: 42.4 APL: 51.5 APM: 43.8 APS: 22.6 box AP: 39.6 |
| object-detection-on-coco-minival | Grid R-CNN (ResNet-101-FPN) | AP50: 60.3 AP75: 44.4 APL: 54.1 APM: 45.8 APS: 23.4 box AP: 41.3 |