Billion-scale semi-supervised learning for image classification
I. Zeki Yalniz; Hervé Jégou; Kan Chen; Manohar Paluri; Dhruv Mahajan

Abstract
This paper presents a study of semi-supervised learning with large convolutional networks. We propose a pipeline, based on a teacher/student paradigm, that leverages a large collection of unlabelled images (up to 1 billion). Our main goal is to improve the performance of a given target architecture, such as ResNet-50 or ResNeXt. We provide an extensive analysis of the success factors of our approach, which leads us to formulate some recommendations for producing high-accuracy models for image classification with semi-supervised learning. As a result, our approach brings important gains to standard architectures for image, video and fine-grained classification. For instance, by leveraging one billion unlabelled images, our learned vanilla ResNet-50 achieves 81.2% top-1 accuracy on the ImageNet benchmark.
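
The teacher/student pipeline summarized in the abstract proceeds in four stages: train a high-capacity teacher on the labelled data, use it to score the unlabelled collection and keep the most confident images per class as pseudo-labels, pre-train the target student architecture on that pseudo-labelled set, and finally fine-tune the student on the original labelled data. The sketch below illustrates this flow in PyTorch; the model choices, hyper-parameters (epochs, learning rates, `k`) and the data-loader helpers passed in are illustrative assumptions, and the per-class selection is a simplification of the paper's full ranking scheme.

```python
# Minimal sketch of a teacher/student semi-supervised pipeline of the kind described
# in the abstract. Hyper-parameters and data loaders are illustrative assumptions.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, resnext101_32x8d


def train(model, loader, epochs, lr, device="cuda"):
    """Plain cross-entropy/SGD loop, reused for the teacher, the student
    pre-training on pseudo-labels, and the final fine-tuning step."""
    model.to(device).train()
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9, weight_decay=1e-4)
    for _ in range(epochs):
        for images, targets in loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(images.to(device)), targets.to(device))
            loss.backward()
            opt.step()
    return model


@torch.no_grad()
def pseudo_label(teacher, unlabeled_loader, num_classes, k, device="cuda"):
    """Score unlabelled images with the teacher and keep the k most confident
    images per class (a simplification of the paper's ranking scheme)."""
    teacher.to(device).eval()
    per_class = [[] for _ in range(num_classes)]  # (confidence, global_index)
    seen = 0
    for images, _ in unlabeled_loader:
        probs = F.softmax(teacher(images.to(device)), dim=1)
        conf, cls = probs.max(dim=1)
        for i in range(images.size(0)):
            per_class[cls[i].item()].append((conf[i].item(), seen + i))
        seen += images.size(0)
    # The top-k images per class define the pseudo-labelled pre-training set.
    return [(idx, c) for c, lst in enumerate(per_class)
            for _, idx in sorted(lst, reverse=True)[:k]]


def semi_supervised(labeled_loader, unlabeled_loader, make_pseudo_loader, k=16000):
    # 1) Train a high-capacity teacher on the labelled set.
    teacher = train(resnext101_32x8d(num_classes=1000), labeled_loader, epochs=90, lr=0.1)
    # 2) Pseudo-label the unlabelled collection with the teacher.
    pseudo = pseudo_label(teacher, unlabeled_loader, num_classes=1000, k=k)
    # 3) Pre-train the target student (here a vanilla ResNet-50) on the pseudo-labels,
    # 4) then fine-tune it on the original labelled set with a lower learning rate.
    student = train(resnet50(num_classes=1000), make_pseudo_loader(pseudo), epochs=30, lr=0.1)
    return train(student, labeled_loader, epochs=30, lr=0.01)
```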
Benchmarks
| Benchmark | Model | Metrics |
|---|---|---|
| Image Classification on ImageNet | ResNeXt-101 32x16d (semi-weakly sup.) | Params: 193M; Top-1 accuracy: 84.8% |
| Image Classification on ImageNet | ResNeXt-101 32x4d (semi-weakly sup.) | Params: 42M; Top-1 accuracy: 83.4% |
| Image Classification on ImageNet | ResNeXt-101 32x8d (semi-weakly sup.) | Params: 88M; Top-1 accuracy: 84.3% |
| Image Classification on OmniBenchmark | IG-1B | Average top-1 accuracy: 40.4 |
| Object Recognition on Shape Bias | SWSL (ResNet-50) | Shape bias: 28.6 |
| Object Recognition on Shape Bias | SWSL (ResNeXt-101) | Shape bias: 49.8 |
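
The semi-weakly supervised ("semi-weakly sup." / SWSL) checkpoints behind these numbers were released publicly and can typically be loaded through torch.hub. The repository path and entry-point name in the sketch below are assumptions about that release, so the exact identifiers should be verified against the repository's hubconf.

```python
import torch

# Hedged example of loading a released semi-weakly supervised ResNeXt-101 32x16d.
# The repository path and entry-point name are assumptions about the public
# facebookresearch release; verify them against the repository's hubconf.py.
model = torch.hub.load(
    "facebookresearch/semi-supervised-ImageNet1K-models",  # assumed repo
    "resnext101_32x16d_swsl",                               # assumed entry point
)
model.eval()  # standard ImageNet preprocessing is expected before measuring top-1 accuracy
```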