DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Victor Sanh; Lysandre Debut; Julien Chaumond; Thomas Wolf

Abstract

As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models in on-the-edge and/or under constrained computational training or inference budgets remains challenging. In this work, we propose a method to pre-train a smaller general-purpose language representation model, called DistilBERT, which can then be fine-tuned with good performances on a wide range of tasks like its larger counterparts. While most prior work investigated the use of distillation for building task-specific models, we leverage knowledge distillation during the pre-training phase and show that it is possible to reduce the size of a BERT model by 40%, while retaining 97% of its language understanding capabilities and being 60% faster. To leverage the inductive biases learned by larger models during pre-training, we introduce a triple loss combining language modeling, distillation and cosine-distance losses. Our smaller, faster and lighter model is cheaper to pre-train and we demonstrate its capabilities for on-device computations in a proof-of-concept experiment and a comparative on-device study.
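
The triple loss mentioned in the abstract combines a masked language modeling term, a soft-target distillation term, and a cosine-distance term between student and teacher hidden states. The sketch below is a minimal PyTorch illustration of that combination, assuming logits and hidden states are already available from a student and a frozen teacher; the temperature and the alpha_* weights are illustrative placeholders, not values taken from this page.

import torch
import torch.nn.functional as F

def distilbert_triple_loss(student_logits, teacher_logits, labels,
                           student_hidden, teacher_hidden,
                           temperature=2.0,
                           alpha_ce=5.0, alpha_mlm=2.0, alpha_cos=1.0):
    """Sketch of the triple loss: distillation + masked LM + cosine alignment.

    Shapes: *_logits are (batch, seq_len, vocab), *_hidden are (batch, seq_len, dim),
    labels is (batch, seq_len) with -100 at positions that are not masked.
    The temperature and alpha_* weights are illustrative, not the paper's values.
    """
    vocab = student_logits.size(-1)
    dim = student_hidden.size(-1)

    # 1) Distillation loss: KL divergence between temperature-softened
    #    teacher and student distributions over the vocabulary.
    loss_ce = F.kl_div(
        F.log_softmax(student_logits.view(-1, vocab) / temperature, dim=-1),
        F.softmax(teacher_logits.view(-1, vocab) / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # 2) Masked-language-modeling loss on the student's own predictions.
    loss_mlm = F.cross_entropy(student_logits.view(-1, vocab),
                               labels.view(-1), ignore_index=-100)

    # 3) Cosine-distance loss aligning student and teacher hidden states.
    target = student_hidden.new_ones(student_hidden.view(-1, dim).size(0))
    loss_cos = F.cosine_embedding_loss(student_hidden.view(-1, dim),
                                       teacher_hidden.view(-1, dim), target)

    return alpha_ce * loss_ce + alpha_mlm * loss_mlm + alpha_cos * loss_cos

A full implementation would also run the teacher under torch.no_grad() and would typically restrict these terms to non-padded positions.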

Code Repositories

The repositories below mention the paper on GitHub; entries marked Official are released by the authors.

stefan-it/europeana-bert (TensorFlow)
reycn/multi-modal-scale (PyTorch)
knuddj1/op_text (PyTorch)
flexible-fl/flex-nlp
askaydevs/distillbert-qa (PyTorch)
dngback/co-forget-protocol
lukexyz/Deep-Lyrical-Genius (PyTorch)
jaketae/pytorch-malware-detection (PyTorch)
knuddy/op_text (PyTorch)
sdadas/polish-roberta (PyTorch)
monologg/distilkobert (PyTorch)
huggingface/transformers (Official, PyTorch; see the loading sketch after this list)
facebookresearch/EgoTV (PyTorch)
frankaging/Causal-Distill (PyTorch)
allenai/scifact (PyTorch)
franknb/Text-Summarization
epfml/collaborative-attention (PyTorch)
twobooks/intro-aws-training (PyTorch)
mkavim/finetune_bert (TensorFlow)
suinleelab/path_explain (TensorFlow)
ayeffkay/rubert-tiny (PyTorch)
nageshsinghc4/deepwrap (TensorFlow)
enzomuschik/distilfnd (PyTorch)
huggingface/swift-coreml-transformers (Official, PyTorch)
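
The pre-trained DistilBERT weights are distributed through the official huggingface/transformers repository listed above. As a quick orientation, here is a short loading sketch; the distilbert-base-uncased checkpoint name and the shape noted in the comments reflect the standard release on the Hugging Face Hub rather than anything stated on this page.

import torch
from transformers import AutoModel, AutoTokenizer

# Load the distilled general-purpose checkpoint via the official library.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")

# Encode a sentence and inspect the final hidden states
# (6 Transformer layers, hidden size 768 for this checkpoint).
inputs = tokenizer("DistilBERT is smaller, faster, cheaper and lighter.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, 768)

AutoTokenizer and AutoModel resolve to the DistilBERT-specific classes from the checkpoint's configuration, so the same snippet works for the other released DistilBERT variants.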

Benchmarks

Benchmark | Methodology | Metrics
linguistic-acceptability-on-cola | DistilBERT 66M | Accuracy: 49.1%
natural-language-inference-on-qnli | DistilBERT 66M | Accuracy: 90.2%
natural-language-inference-on-rte | DistilBERT 66M | Accuracy: 62.9%
natural-language-inference-on-wnli | DistilBERT 66M | Accuracy: 44.4%
question-answering-on-multitq | DistilBERT | Hits@1: 8.3; Hits@10: 48.4
question-answering-on-quora-question-pairs | DistilBERT 66M | Accuracy: 89.2%
question-answering-on-squad11-dev | DistilBERT 66M | F1: 85.8
question-answering-on-squad11-dev | DistilBERT | EM: 77.7
semantic-textual-similarity-on-mrpc | DistilBERT 66M | Accuracy: 90.2%
semantic-textual-similarity-on-sts-benchmark | DistilBERT 66M | Pearson Correlation: 0.907
sentiment-analysis-on-imdb | DistilBERT 66M | Accuracy: 92.82%
sentiment-analysis-on-sst-2-binary | DistilBERT 66M | Accuracy: 91.3%
task-1-grouping-on-ocw | DistilBERT (BASE) | # Correct Groups: 49 ± 4; # Solved Walls: 0 ± 0; Adjusted Mutual Information (AMI): 14.0 ± 0.3; Adjusted Rand Index (ARI): 11.3 ± 0.3; Fowlkes Mallows Score (FMS): 29.1 ± 0.2; Wasserstein Distance (WD): 86.7 ± 0.6
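
Several of the rows above (for example SST-2 and SQuAD 1.1 dev) correspond to DistilBERT fine-tuned on the individual downstream task. The following sketch shows one way to try such fine-tuned variants through the transformers pipeline API; the two checkpoint names are assumptions based on publicly released Hugging Face Hub models, not checkpoints referenced by this page.

from transformers import pipeline

# Sentiment analysis with a DistilBERT checkpoint fine-tuned on SST-2
# (checkpoint name assumed from the Hugging Face Hub).
sentiment = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(sentiment("DistilBERT keeps most of BERT's accuracy at a fraction of the size."))

# Extractive question answering with a DistilBERT checkpoint distilled on SQuAD 1.1
# (checkpoint name assumed from the Hugging Face Hub).
qa = pipeline(
    "question-answering",
    model="distilbert-base-cased-distilled-squad",
)
print(qa(
    question="How much smaller is DistilBERT than BERT?",
    context="DistilBERT reduces the size of a BERT model by 40% while retaining "
            "97% of its language understanding capabilities and being 60% faster.",
))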
