Learning Deep Features for Discriminative Localization
Bolei Zhou; Aditya Khosla; Agata Lapedriza; Aude Oliva; Antonio Torralba

Abstract
In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network to have remarkable localization ability despite being trained on image-level labels. While this technique was previously proposed as a means for regularizing training, we find that it actually builds a generic localizable deep representation that can be applied to a variety of tasks. Despite the apparent simplicity of global average pooling, we are able to achieve 37.1% top-5 error for object localization on ILSVRC 2014, which is remarkably close to the 34.2% top-5 error achieved by a fully supervised CNN approach. We demonstrate that our network is able to localize the discriminative image regions on a variety of tasks despite not being trained for them.
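The mechanism the abstract describes, projecting a class's output weights back onto the final convolutional feature maps to obtain a class activation map, can be sketched as follows. This is a minimal NumPy illustration; the function name and array shapes are assumptions for the example, not the authors' code:

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """Compute a class activation map (CAM).

    feature_maps:  (C, H, W) activations of the last conv layer.
    class_weights: (C,) weights of the output unit for one class,
                   learned on top of global-average-pooled features.
    Returns an (H, W) map, normalized to [0, 1].
    """
    # Weighted sum over channels: each spatial location's score for the class.
    cam = np.tensordot(class_weights, feature_maps, axes=1)  # -> (H, W)
    # Normalize for visualization (e.g. overlay as a heatmap on the image).
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam
```

Because global average pooling is linear, the class score is the spatial average of this map, which is why the same weights that classify the image also localize the discriminative regions.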
Benchmarks
| Benchmark | Methodology | Metric | Value (%) |
|---|---|---|---|
| weakly-supervised-object-localization-on | AlexNet-GAP | Top-1 Error Rate | 67.19 |
| weakly-supervised-object-localization-on-1 | AlexNet-GAP | Top-5 Error | 52.16 |
| weakly-supervised-object-localization-on-1 | VGGnet-GAP | Top-5 Error | 45.14 |
| weakly-supervised-object-localization-on-tiny | CAM | Top-1 Localization Accuracy | 40.55 |