3 months ago

Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet?

Nenad Tomasev Ioana Bica Brian McWilliams Lars Buesing Razvan Pascanu Charles Blundell Jovana Mitrovic

Abstract

Despite recent progress made by self-supervised methods in representation learning with residual networks, these methods still underperform supervised learning on the ImageNet classification benchmark, limiting their applicability in performance-critical settings. Building on prior theoretical insights from ReLIC [Mitrovic et al., 2021], we incorporate additional inductive biases into self-supervised learning. We propose a new self-supervised representation learning method, ReLICv2, which combines an explicit invariance loss with a contrastive objective over a varied set of appropriately constructed data views to avoid learning spurious correlations and obtain more informative representations. ReLICv2 achieves $77.1\%$ top-$1$ accuracy on ImageNet under linear evaluation on a ResNet50, improving on the previous state of the art by an absolute $+1.5\%$; on larger ResNet models, ReLICv2 achieves up to $80.6\%$, outperforming previous self-supervised approaches by margins of up to $+2.3\%$. Most notably, ReLICv2 is the first unsupervised representation learning method to consistently outperform the supervised baseline in a like-for-like comparison over a range of ResNet architectures. Using ReLICv2, we also learn more robust and transferable representations that generalize better out-of-distribution than previous work, both on image classification and semantic segmentation. Finally, we show that despite using ResNet encoders, ReLICv2 is comparable to state-of-the-art self-supervised vision transformers.
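
To make the phrase "an explicit invariance loss with a contrastive objective" more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation and may differ from the exact objective in the paper: the function name `sketch_loss`, the `temperature` and `beta` defaults, and the use of a KL divergence between the two views' similarity distributions as the invariance term are all assumptions made for illustration. A real training setup would also need gradients, multiple views, and a trained encoder, none of which appear here.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def similarity_distribution(anchor, candidates, temperature):
    """Softmax over temperature-scaled cosine similarities to each candidate."""
    a = normalize(anchor)
    logits = [
        sum(x * y for x, y in zip(a, normalize(c))) / temperature
        for c in candidates
    ]
    return softmax(logits)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def sketch_loss(view_a, view_b, temperature=0.1, beta=1.0):
    """Contrastive term plus beta times an invariance penalty (illustrative).

    view_a[i] and view_b[i] are embeddings of two augmentations of image i.
    The hyperparameter values are placeholders, not taken from the paper.
    """
    n = len(view_a)
    contrastive = 0.0
    invariance = 0.0
    for i in range(n):
        p_ab = similarity_distribution(view_a[i], view_b, temperature)
        p_ba = similarity_distribution(view_b[i], view_a, temperature)
        # Contrastive part: the matching view of image i should be most similar.
        contrastive += -math.log(p_ab[i])
        # Invariance part: both views should induce the same distribution.
        invariance += kl_divergence(p_ab, p_ba)
    return (contrastive + beta * invariance) / n


aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
print(sketch_loss(aligned, aligned) < sketch_loss(aligned, swapped))
```

In this toy setup the loss is small when the two views of each image agree and large when they are mismatched, which is the behavior the contrastive term is meant to encourage.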

Code Repositories

google-deepmind/relicv2 (official; JAX; mentioned in GitHub)

Benchmarks

Benchmark	Methodology	Metrics
image-classification-on-objectnet	SimCLR	Top-1 Accuracy: 14.6
image-classification-on-objectnet	ReLICv2	Top-1 Accuracy: 25.9
image-classification-on-objectnet	ReLIC	Top-1 Accuracy: 23.8
image-classification-on-objectnet	BYOL	Top-1 Accuracy: 23
self-supervised-image-classification-on	ReLICv2 (ResNet-101)	Number of Params: 44M; Top-1 Accuracy: 78.7%
self-supervised-image-classification-on	ReLICv2 (ResNet-200 2x)	Number of Params: 250M; Top-1 Accuracy: 80.6%
self-supervised-image-classification-on	ReLICv2 (ResNet-50)	Number of Params: 25M; Top-1 Accuracy: 77.1%
self-supervised-image-classification-on	ReLICv2 (ResNet-200)	Number of Params: 63M; Top-1 Accuracy: 79.8%
self-supervised-image-classification-on	ReLICv2 (ResNet-50 4x)	Number of Params: 375M; Top-1 Accuracy: 79.4%
self-supervised-image-classification-on	ReLICv2 (ResNet-152)	Number of Params: 58M; Top-1 Accuracy: 79.3%
self-supervised-image-classification-on	ReLICv2 (ResNet-50 2x)	Number of Params: 94M; Top-1 Accuracy: 79%
semantic-segmentation-on-cityscapes-val	BYOL	mIoU: 74.6
semantic-segmentation-on-cityscapes-val	ReLICv2	mIoU: 75.2
semantic-segmentation-on-pascal-voc-2012-val	ReLICv2	mIoU: 77.9%
semantic-segmentation-on-pascal-voc-2012-val	BYOL	mIoU: 75.7%
semantic-segmentation-on-pascal-voc-2012-val	DetCon	mIoU: 77.3%
semi-supervised-image-classification-on-1	ReLICv2	Top-1 Accuracy: 58.1%; Top-5 Accuracy: 81.3%
semi-supervised-image-classification-on-2	ReLICv2 (ResNet-50)	Top-1 Accuracy: 72.4%; Top-5 Accuracy: 91.2%
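
For readers unfamiliar with the top-1 and top-5 metrics reported above, the following sketch shows one common way to compute top-k accuracy from per-class scores. The function name `top_k_accuracy` and the toy scores are hypothetical and are not taken from the paper or its repository.

```python
def top_k_accuracy(scores, labels, k=1):
    """Percentage of examples whose true label is among the k highest scores.

    scores[i][c] is the model's score for class c on example i;
    labels[i] is the index of the true class of example i.
    """
    correct = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data: three examples, three classes (illustrative values only).
toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
toy_labels = [1, 1, 0]
print(top_k_accuracy(toy_scores, toy_labels, k=1))
print(top_k_accuracy(toy_scores, toy_labels, k=2))
```

Top-5 accuracy is the same computation with `k=5`, so it is always at least as high as top-1 accuracy on the same predictions.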
