Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results

Antti Tarvainen; Harri Valpola

Abstract

The recently proposed Temporal Ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks. It maintains an exponential moving average of label predictions on each training example, and penalizes predictions that are inconsistent with this target. However, because the targets change only once per epoch, Temporal Ensembling becomes unwieldy when learning large datasets. To overcome this problem, we propose Mean Teacher, a method that averages model weights instead of label predictions. As an additional benefit, Mean Teacher improves test accuracy and enables training with fewer labels than Temporal Ensembling. Without changing the network architecture, Mean Teacher achieves an error rate of 4.35% on SVHN with 250 labels, outperforming Temporal Ensembling trained with 1000 labels. We also show that a good network architecture is crucial to performance. Combining Mean Teacher and Residual Networks, we improve the state of the art on CIFAR-10 with 4000 labels from 10.55% to 6.28%, and on ImageNet 2012 with 10% of the labels from 35.24% to 9.11%.
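
The core idea in the abstract, averaging model weights rather than per-example label predictions, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Weights` container, the function names `ema_update` and `consistency_loss`, the decay value `alpha`, and the choice of mean squared error as the consistency penalty are not taken from the abstract, and this is not the authors' implementation.

```python
from typing import Dict, List

# Hypothetical flat parameter store: parameter name -> list of values.
Weights = Dict[str, List[float]]


def ema_update(teacher: Weights, student: Weights, alpha: float = 0.99) -> None:
    """Update the teacher in place: teacher <- alpha * teacher + (1 - alpha) * student.

    alpha is an illustrative decay value, not one stated in the abstract.
    """
    for name, student_values in student.items():
        teacher_values = teacher[name]
        for i, s in enumerate(student_values):
            teacher_values[i] = alpha * teacher_values[i] + (1.0 - alpha) * s


def consistency_loss(student_probs: List[float], teacher_probs: List[float]) -> float:
    """Mean squared difference between student and teacher predictions.

    MSE is an assumed choice of penalty for predictions that disagree
    with the teacher's target.
    """
    squared = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(squared) / len(squared)


# Toy usage: the teacher drifts toward a fixed student after every step,
# so its targets can change per training step instead of once per epoch.
teacher = {"w": [0.0, 0.0]}
student = {"w": [1.0, -1.0]}
for _ in range(3):
    ema_update(teacher, student, alpha=0.5)

loss = consistency_loss([1.0, 0.0], [0.5, 0.5])
```

In a real training loop the student would be updated by gradient descent on the classification loss plus the consistency term, and `ema_update` would run after each optimizer step; the sketch only shows the averaging and the penalty in isolation.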

Code Repositories

sud0301/semisup-semseg: PyTorch, mentioned in GitHub
INK-USC/DualRE: PyTorch, mentioned in GitHub
liuwei16/ALFNet: TensorFlow, mentioned in GitHub
benathi/fastswa-semi-sup: PyTorch, mentioned in GitHub
CuriousAI/mean-teacher: official implementation, TensorFlow, mentioned in GitHub
Lan1991Xu/ONE_NeurIPS2018: PyTorch, mentioned in GitHub
ZHKKKe/PixelSSL: PyTorch, mentioned in GitHub

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| semi-supervised-image-classification-on-2 | Mean Teacher (ResNeXt-152) | Top 5 Accuracy: 90.89% |
| semi-supervised-image-classification-on-cifar | Mean Teacher | Percentage error: 6.28 |
| semi-supervised-image-classification-on-cifar-6 | MeanTeacher | Percentage error: 47.32 |
| semi-supervised-image-classification-on-svhn | Mean Teacher | Accuracy: 96.05 |
| semi-supervised-image-classification-on-svhn-1 | MeanTeacher | Accuracy: 93.55 |
| semi-supervised-semantic-segmentation-on-23 | MeanTeacher (Range View) | mIoU (1% Labels): 34.2; mIoU (10% Labels): 49.8; mIoU (20% Labels): 51.6; mIoU (50% Labels): 53.3 |
| semi-supervised-semantic-segmentation-on-23 | MeanTeacher (Voxel) | mIoU (1% Labels): 41.0; mIoU (10% Labels): 50.1; mIoU (20% Labels): 52.8; mIoU (50% Labels): 53.9 |
| semi-supervised-semantic-segmentation-on-24 | MeanTeacher (Range View) | mIoU (1% Labels): 37.5; mIoU (10% Labels): 53.1; mIoU (20% Labels): 56.1; mIoU (50% Labels): 57.4 |
| semi-supervised-semantic-segmentation-on-24 | MeanTeacher (Voxel) | mIoU (1% Labels): 45.4; mIoU (10% Labels): 57.1; mIoU (20% Labels): 59.2; mIoU (50% Labels): 60.0 |
| semi-supervised-semantic-segmentation-on-25 | MeanTeacher (Range View) | mIoU (1% Labels): 42.1; mIoU (10% Labels): 60.4; mIoU (20% Labels): 65.4; mIoU (50% Labels): 69.4 |
| semi-supervised-semantic-segmentation-on-25 | MeanTeacher (Voxel) | mIoU (1% Labels): 51.6; mIoU (10% Labels): 66.0; mIoU (20% Labels): 67.1; mIoU (50% Labels): 71.7 |
