Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby

Abstract
Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the model on a target task. We scale up pre-training and propose a simple recipe that we call Big Transfer (BiT). By combining a few carefully selected components, and transferring using a simple heuristic, we achieve strong performance on over 20 datasets. BiT performs well across a surprisingly wide range of data regimes, from 1 example per class to 1M total examples. BiT achieves 87.5% top-1 accuracy on ILSVRC-2012, 99.4% on CIFAR-10, and 76.3% on the 19-task Visual Task Adaptation Benchmark (VTAB). On small datasets, BiT attains 76.8% on ILSVRC-2012 with 10 examples per class, and 97.0% on CIFAR-10 with 10 examples per class. We conduct a detailed analysis of the main components that lead to high transfer performance.
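The few-shot regimes mentioned in the abstract (for example, 10 examples per class) can be illustrated by how such a subset might be drawn from a labeled dataset. The following is a minimal sketch, not the authors' data pipeline: the helper name `sample_k_shot`, the fixed seed, and the toy label list are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_k_shot(labels, k, seed=0):
    """Return sorted dataset indices containing exactly k examples per class.

    Hypothetical helper for illustration; not taken from the BiT code.
    """
    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    chosen = []
    for label, indices in sorted(by_class.items()):
        if len(indices) < k:
            raise ValueError(
                f"class {label!r} has only {len(indices)} examples, need {k}"
            )
        # Draw k distinct examples of this class without replacement.
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Toy stand-in for a real dataset: 10 classes with 50 examples each.
labels = [c for c in range(10) for _ in range(50)]
subset = sample_k_shot(labels, k=10)
```

With 10 classes and `k=10`, the subset holds 100 indices; a model would then be fine-tuned on only those examples.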
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| fine-grained-image-classification-on-oxford | BiT-M (ResNet) | Accuracy: 99.30%; Top-1 Error Rate: 0.70% |
| fine-grained-image-classification-on-oxford | BiT-L (ResNet) | Accuracy: 99.63%; Top-1 Error Rate: 0.37% |
| fine-grained-image-classification-on-oxford-2 | BiT-L (ResNet) | Accuracy: 96.62%; Top-1 Error Rate: 3.38% |
| fine-grained-image-classification-on-oxford-2 | BiT-M (ResNet) | Accuracy: 94.47%; Top-1 Error Rate: 5.53% |
| image-classification-on-cifar-10 | BiT-L (ResNet) | Percentage correct: 99.37% |
| image-classification-on-cifar-10 | BiT-M (ResNet) | Percentage correct: 98.91% |
| image-classification-on-cifar-100 | BiT-M (ResNet) | Percentage correct: 92.17% |
| image-classification-on-cifar-100 | BiT-L (ResNet) | Percentage correct: 93.51% |
| image-classification-on-flowers-102 | BiT-L (ResNet) | Accuracy: 99.63% |
| image-classification-on-flowers-102 | BiT-M (ResNet) | Accuracy: 99.30% |
| image-classification-on-imagenet | BiT-M (ResNet) | Number of params: 928M; Top-1 Accuracy: 85.39% |
| image-classification-on-imagenet | BiT-L (ResNet) | Top-1 Accuracy: 87.54%; Top-5 Accuracy: 98.46% |
| image-classification-on-imagenet-real | BiT-L | Accuracy: 90.54%; Params: 928M |
| image-classification-on-imagenet-real | BiT-M | Accuracy: 89.02% |
| image-classification-on-objectnet | BiT-L (ResNet-152x4) | Top-1 Accuracy: 58.7%; Top-5 Accuracy: 80% |
| image-classification-on-objectnet | BiT-M (ResNet-152x4) | Top-1 Accuracy: 47.0%; Top-5 Accuracy: 69% |
| image-classification-on-objectnet | BiT-S (ResNet-152x4) | Top-1 Accuracy: 36.0%; Top-5 Accuracy: 57% |
| image-classification-on-objectnet-bounding | BiT-S (ResNet) | Top-5 Accuracy: 64.4% |
| image-classification-on-objectnet-bounding | BiT-M (ResNet) | Top-5 Accuracy: 76.0% |
| image-classification-on-objectnet-bounding | BiT-L (ResNet) | Top-5 Accuracy: 85.1% |
| image-classification-on-omnibenchmark | BiT-M | Average Top-1 Accuracy: 40.4% |
| image-classification-on-vtab-1k-1 | BiT-S | Top-1 Accuracy: 66.9% |
| image-classification-on-vtab-1k-1 | BiT-L | Top-1 Accuracy: 76.3% |
| image-classification-on-vtab-1k-1 | BiT-L (50 hypers/task) | Top-1 Accuracy: 78.72% |
| image-classification-on-vtab-1k-1 | BiT-M | Top-1 Accuracy: 70.6% |
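
Several rows above report both an accuracy and a top-1 error rate. The two are complementary (error rate = 100% minus accuracy), which the short sketch below checks against the values listed for the two Oxford benchmarks. The function name `top1_error` and the dictionary layout are illustrative assumptions; the numbers are copied from the table.

```python
def top1_error(accuracy_pct):
    """Top-1 error rate in percent, given top-1 accuracy in percent."""
    # Round to two decimals to match the table's precision.
    return round(100.0 - accuracy_pct, 2)


# (benchmark, model) -> (accuracy %, reported top-1 error %), from the table.
reported = {
    ("oxford", "BiT-M"): (99.30, 0.70),
    ("oxford", "BiT-L"): (99.63, 0.37),
    ("oxford-2", "BiT-L"): (96.62, 3.38),
    ("oxford-2", "BiT-M"): (94.47, 5.53),
}

# True for every row whose reported error matches 100 - accuracy.
consistent = {
    key: top1_error(acc) == err for key, (acc, err) in reported.items()
}
```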