
Boosting the Performance of Semi-Supervised Learning with Unsupervised Clustering

Boaz Lerner, Guy Shiran, Daphna Weinshall

Abstract

Recently, Semi-Supervised Learning (SSL) has shown much promise in leveraging unlabeled data when only very few labels are available. In this paper, we show that intermittently ignoring the labels altogether for whole epochs during training can significantly improve performance in the small sample regime. More specifically, we propose to train a network on two tasks jointly. The primary classification task is exposed to both the unlabeled and the scarcely annotated data, whereas the secondary task seeks to cluster the data without any labels. As opposed to the hand-crafted pretext tasks frequently used in self-supervision, our clustering phase uses the same classification network and head, in an attempt to relax the primary task and propagate the information from the labels without overfitting to them. In addition, the self-supervised technique of classifying image rotations is incorporated during the unsupervised learning phase to stabilize training. We demonstrate our method's efficacy in boosting several state-of-the-art SSL algorithms, significantly improving their results and reducing running time on various standard semi-supervised benchmarks, including 92.6% accuracy on CIFAR-10 and 96.9% on SVHN, using only 4 labels per class in each task. We also notably improve the results in the extreme cases of 1, 2, and 3 labels per class, and show that the features learned by our model are more meaningful for separating the data.
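The training scheme described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names, the `unsup_every` period, and the phase labels are hypothetical, and the abstract only says that labels are ignored for whole epochs "intermittently" without giving a schedule. The second half shows one common way to build a rotation-classification pretext task, where each image is presented in four orientations and the target is the rotation index.

```python
def phase_schedule(num_epochs, unsup_every=2):
    """Assign a training phase to each epoch.

    Assumption: every `unsup_every`-th epoch is an unsupervised
    (label-free clustering) epoch; the abstract does not give the
    actual schedule.
    """
    phases = []
    for epoch in range(num_epochs):
        if (epoch + 1) % unsup_every == 0:
            phases.append("unsupervised_clustering")
        else:
            phases.append("semi_supervised")
    return phases


def rotate90(image, k):
    """Rotate a 2-D list-of-lists image clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Reverse the row order, then transpose: one clockwise quarter turn.
        image = [list(row) for row in zip(*image[::-1])]
    return image


def rotation_batch(images):
    """Build rotation pretext data: 4 orientations per image.

    Returns the rotated images and, for each, the rotation index
    (0, 1, 2, 3) that a rotation classifier would have to predict.
    """
    rotated, targets = [], []
    for img in images:
        for k in range(4):
            rotated.append(rotate90(img, k))
            targets.append(k)
    return rotated, targets


if __name__ == "__main__":
    print(phase_schedule(6, unsup_every=3))
    batch, labels = rotation_batch([[[1, 2], [3, 4]]])
    for img, k in zip(batch, labels):
        print(k, img)
```

In a real training loop, the phase returned for each epoch would decide whether the labeled loss is used at all, and the rotation targets would feed an auxiliary loss during the label-free epochs only.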

Code Repositories

boazlern/SSClustering

Official

pytorch


Benchmarks

Benchmark	Methodology	Metrics
semi-supervised-image-classification-on-cifar-15	Semi-MMDC	Percentage error: 28.1±5.5
semi-supervised-image-classification-on-cifar-17	Semi-MMDC	Accuracy (Test): 70.84±8.1
semi-supervised-image-classification-on-cifar-6	Semi-MMDC	Percentage error: 5.51±0.25
semi-supervised-image-classification-on-cifar-7	Semi-MMDC	Percentage error: 7.39±0.61
semi-supervised-image-classification-on-stl-1	Semi-MMDC	Accuracy: 95.22±0.29
semi-supervised-image-classification-on-svhn-1	Semi-MMDC	Accuracy: 97.7±0.03
semi-supervised-image-classification-on-svhn-2	Semi-MMDC	Percentage error: 3.09±0.54
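The table mixes two metrics, percentage error and accuracy, which makes rows hard to compare at a glance. The short sketch below puts the mean values on a single accuracy scale using accuracy = 100 - error. The row keys are shortened forms of the benchmark slugs above, and `to_accuracy` is a hypothetical helper. Under this conversion, 100 - 7.39 = 92.61 and 100 - 3.09 = 96.91, which match the 92.6% and 96.9% figures quoted in the abstract, although the table itself does not state which row corresponds to which label budget.

```python
# (benchmark slug suffix, metric kind, mean value) copied from the table.
ROWS = [
    ("cifar-15", "error", 28.1),
    ("cifar-17", "accuracy", 70.84),
    ("cifar-6", "error", 5.51),
    ("cifar-7", "error", 7.39),
    ("stl-1", "accuracy", 95.22),
    ("svhn-1", "accuracy", 97.7),
    ("svhn-2", "error", 3.09),
]


def to_accuracy(kind, value):
    """Convert a table entry to accuracy in percent."""
    if kind == "error":
        return round(100.0 - value, 2)
    if kind == "accuracy":
        return round(value, 2)
    raise ValueError(f"unknown metric kind: {kind}")


if __name__ == "__main__":
    for name, kind, value in ROWS:
        print(f"{name}: {to_accuracy(kind, value):.2f}% accuracy")
```

The standard deviations carry over unchanged, since subtracting from a constant does not alter the spread.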
