Rethinking Pre-training and Self-training

Barret Zoph, Golnaz Ghiasi, Tsung-Yi Lin, Yin Cui, Hanxiao Liu, Ekin D. Cubuk, Quoc V. Le

Abstract

Pre-training is a dominant paradigm in computer vision. For example, supervised ImageNet pre-training is commonly used to initialize the backbones of object detection and segmentation models. He et al., however, show the surprising result that ImageNet pre-training has limited impact on COCO object detection. Here we investigate self-training as another method for utilizing additional data on the same setup and contrast it against ImageNet pre-training. Our study reveals the generality and flexibility of self-training with three additional insights: 1) stronger data augmentation and more labeled data further diminish the value of pre-training, 2) unlike pre-training, self-training is always helpful when using stronger data augmentation, in both low-data and high-data regimes, and 3) in the case that pre-training is helpful, self-training improves upon pre-training. For example, on the COCO object detection dataset, pre-training helps when we use one fifth of the labeled data, and hurts accuracy when we use all labeled data. Self-training, on the other hand, shows positive improvements from +1.3 to +3.4 AP across all dataset sizes. In other words, self-training works well on exactly the same setup on which pre-training does not work (using ImageNet to help COCO). On the PASCAL segmentation dataset, which is much smaller than COCO, pre-training does help significantly, and self-training still improves upon the pre-trained model. On COCO object detection, we achieve 54.3 AP, an improvement of +1.5 AP over the strongest SpineNet model. On PASCAL segmentation, we achieve 90.5% mIoU, an improvement of +1.5% mIoU over the previous state-of-the-art result by DeepLabv3+.
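To make the idea of self-training concrete, here is a minimal toy sketch of the general teacher/student pseudo-labeling loop: a teacher is fit on the labeled data, it labels the unlabeled data, and a student is fit on both. This is an illustration only and not the authors' detection or segmentation pipeline. The nearest-centroid classifier, the function names (`fit_centroids`, `predict`, `self_train`), the `min_margin` confidence filter, and the 2-D toy points are all assumptions introduced for this example.

```python
def fit_centroids(points, labels):
    """Fit a nearest-centroid classifier: one mean (x, y) per class."""
    sums, counts = {}, {}
    for (x, y), label in zip(points, labels):
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {c: (sums[c][0] / counts[c], sums[c][1] / counts[c]) for c in sums}


def predict(centroids, point):
    """Return (label, margin), where margin is the distance gap between
    the nearest and second-nearest centroid (a crude confidence score)."""
    dists = sorted(
        (((point[0] - cx) ** 2 + (point[1] - cy) ** 2) ** 0.5, c)
        for c, (cx, cy) in centroids.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin


def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=1.0):
    """One round of self-training (hypothetical toy version)."""
    # 1) Train the teacher on human-labeled data only.
    teacher = fit_centroids(labeled_x, labeled_y)
    # 2) Pseudo-label the unlabeled data, keeping confident predictions.
    pseudo_x, pseudo_y = [], []
    for p in unlabeled_x:
        label, margin = predict(teacher, p)
        if margin >= min_margin:
            pseudo_x.append(p)
            pseudo_y.append(label)
    # 3) Train the student on human labels plus pseudo labels.
    student = fit_centroids(labeled_x + pseudo_x, labeled_y + pseudo_y)
    return teacher, student, len(pseudo_x)


# Toy data: two labeled points per class and five unlabeled points.
labeled_x = [(0.0, 0.0), (1.0, 0.0), (4.0, 4.0), (5.0, 4.0)]
labeled_y = [0, 0, 1, 1]
unlabeled_x = [(0.0, 1.0), (-1.0, 0.5), (5.0, 5.0), (4.0, 3.0), (2.5, 2.0)]

teacher, student, n_pseudo = self_train(labeled_x, labeled_y, unlabeled_x)
print(n_pseudo)    # 4: the ambiguous point (2.5, 2.0) is filtered out
print(teacher[0])  # (0.5, 0.0)
print(student[0])  # (0.0, 0.375)
```

In this toy the student simply refits on the enlarged dataset. The setting described in the abstract additionally relies on stronger data augmentation when training, which this sketch does not model.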

Code Repositories

stanleyjzheng/PyData

Mentioned in GitHub

tensorflow/tpu/tree/master/models/official/detection/projects/self_training

Benchmarks

Benchmark	Methodology	Metrics
object-detection-on-coco	SpineNet-190 (1280, with Self-training on OpenImages, single-scale)	box mAP: 54.3
object-detection-on-coco-minival	SpineNet-190 (1280, with Self-training on OpenImages, single-scale)	box AP: 54.2
semantic-segmentation-on-pascal-voc-2012-val	EfficientNet-L2+NAS-FPN (single scale test, with self-training)	mIoU: 90.0%
