
Big Transfer (BiT): General Visual Representation Learning

Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby

Abstract

Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the model on a target task. We scale up pre-training and propose a simple recipe that we call Big Transfer (BiT). By combining a few carefully selected components, and transferring using a simple heuristic, we achieve strong performance on over 20 datasets. BiT performs well across a surprisingly wide range of data regimes, from 1 example per class to 1M total examples. BiT achieves 87.5% top-1 accuracy on ILSVRC-2012, 99.4% on CIFAR-10, and 76.3% on the 19-task Visual Task Adaptation Benchmark (VTAB). On small datasets, BiT attains 76.8% on ILSVRC-2012 with 10 examples per class, and 97.0% on CIFAR-10 with 10 examples per class. We conduct a detailed analysis of the main components that lead to high transfer performance.
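
The low-data regimes mentioned in the abstract (for example, 10 examples per class) imply drawing a small, class-balanced subset of a labelled training set. The following is a minimal sketch of how such a k-shot subset might be built. The helper name `sample_k_per_class`, the seed handling, and the toy labels are illustrative assumptions, not the authors' evaluation protocol.

```python
import random
from collections import defaultdict


def sample_k_per_class(labels, k, seed=0):
    """Return sorted indices of a class-balanced subset with k examples per class.

    Hypothetical helper for illustration; not taken from the BiT code base.
    """
    # Group example indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    chosen = []
    for label, indices in by_class.items():
        if len(indices) < k:
            raise ValueError(f"class {label!r} has only {len(indices)} examples")
        # Draw k distinct indices from this class without replacement.
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Toy label list (an assumption): 3 classes with 5 examples each.
toy_labels = [0] * 5 + [1] * 5 + [2] * 5
subset = sample_k_per_class(toy_labels, k=2)
```

With `k=10` and real dataset labels, the returned indices would select a "10 examples per class" training split; which examples are drawn depends on the seed, so few-shot results are typically averaged over several seeds.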

Code Repositories

sayakpaul/FunMatch-Distillation (Mentioned in GitHub)
batsresearch/taglets (pytorch; Mentioned in GitHub)
MS-Mind/MS-Code-02/tree/main/configs/bit (mindspore)
2024-MindSpore-1/Code2/tree/main/model-1/bit (mindspore)
SoojungYang/supervised_pretraining_GN_WS (Mentioned in GitHub)
bethgelab/InDomainGeneralizationBenchmark (pytorch; Mentioned in GitHub)
google-research/big_transfer (Official; jax; Mentioned in GitHub)
hw666666666666/BigTransfer (mindspore)
sayakpaul/A-Barebones-Image-Retrieval-System (Mentioned in GitHub)

Benchmarks

Benchmark	Methodology	Metrics
fine-grained-image-classification-on-oxford	BiT-M (ResNet)	Accuracy: 99.30%; Top-1 Error Rate: 0.70%
fine-grained-image-classification-on-oxford	BiT-L (ResNet)	Accuracy: 99.63%; Top-1 Error Rate: 0.37%
fine-grained-image-classification-on-oxford-2	BiT-L (ResNet)	Accuracy: 96.62%; Top-1 Error Rate: 3.38%
fine-grained-image-classification-on-oxford-2	BiT-M (ResNet)	Accuracy: 94.47%; Top-1 Error Rate: 5.53%
image-classification-on-cifar-10	BiT-L (ResNet)	Percentage correct: 99.37%
image-classification-on-cifar-10	BiT-M (ResNet)	Percentage correct: 98.91%
image-classification-on-cifar-100	BiT-M (ResNet)	Percentage correct: 92.17%
image-classification-on-cifar-100	BiT-L (ResNet)	Percentage correct: 93.51%
image-classification-on-flowers-102	BiT-L (ResNet)	Accuracy: 99.63%
image-classification-on-flowers-102	BiT-M (ResNet)	Accuracy: 99.30%
image-classification-on-imagenet	BiT-M (ResNet)	Number of params: 928M; Top-1 Accuracy: 85.39%
image-classification-on-imagenet	BiT-L (ResNet)	Top-1 Accuracy: 87.54%; Top-5 Accuracy: 98.46%
image-classification-on-imagenet-real	BiT-L	Accuracy: 90.54%; Params: 928M
image-classification-on-imagenet-real	BiT-M	Accuracy: 89.02%
image-classification-on-objectnet	BiT-L (ResNet-152x4)	Top-1 Accuracy: 58.7%; Top-5 Accuracy: 80%
image-classification-on-objectnet	BiT-M (ResNet-152x4)	Top-1 Accuracy: 47.0%; Top-5 Accuracy: 69%
image-classification-on-objectnet	BiT-S (ResNet-152x4)	Top-1 Accuracy: 36.0%; Top-5 Accuracy: 57%
image-classification-on-objectnet-bounding	BiT-S (ResNet)	Top-5 Accuracy: 64.4%
image-classification-on-objectnet-bounding	BiT-M (ResNet)	Top-5 Accuracy: 76.0%
image-classification-on-objectnet-bounding	BiT-L (ResNet)	Top-5 Accuracy: 85.1%
image-classification-on-omnibenchmark	BiT-M	Average Top-1 Accuracy: 40.4%
image-classification-on-vtab-1k-1	BiT-S	Top-1 Accuracy: 66.9%
image-classification-on-vtab-1k-1	BiT-L	Top-1 Accuracy: 76.3%
image-classification-on-vtab-1k-1	BiT-L (50 hypers/task)	Top-1 Accuracy: 78.72%
image-classification-on-vtab-1k-1	BiT-M	Top-1 Accuracy: 70.6%
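
Several rows above report both an accuracy and a top-1 error rate, which should be complementary (error = 100 - accuracy, both in percent). The short check below is a sketch of that arithmetic using the four rows that list both values; the dictionary keys and the tolerance are illustrative assumptions.

```python
def top1_error(accuracy_percent):
    """Top-1 error rate in percent, given top-1 accuracy in percent."""
    return 100.0 - accuracy_percent


# (accuracy %, reported top-1 error %) pairs copied from the table rows above.
reported = {
    ("oxford", "BiT-M"): (99.30, 0.70),
    ("oxford", "BiT-L"): (99.63, 0.37),
    ("oxford-2", "BiT-L"): (96.62, 3.38),
    ("oxford-2", "BiT-M"): (94.47, 5.53),
}

# Flag any row whose reported error differs from 100 - accuracy
# by more than half of the last reported decimal place.
mismatches = [
    key
    for key, (accuracy, error) in reported.items()
    if abs(top1_error(accuracy) - error) >= 0.005
]
```

An empty `mismatches` list indicates the two metrics in those rows agree with each other.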
