
RoBERTa: A Robustly Optimized BERT Pretraining Approach

Abstract

Language model pretraining has led to significant performance gains, but carefully comparing different approaches is challenging. Training is computationally expensive, often carried out on private datasets of different sizes, and, as we will show, hyperparameter choices have a significant impact on the final results. We present a replication study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and of training data size. We find that BERT was significantly undertrained, and that it can match or exceed the performance of every model published after it. Our best model achieves state-of-the-art results on the GLUE, RACE and SQuAD benchmarks. These results highlight the importance of previously overlooked design choices and raise questions about the source of recently reported improvements. We release our models and code.
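Since the released checkpoints are distributed through the repositories listed below, a quick way to probe them is through the masked-language-modeling objective they were pretrained on. The following is a minimal sketch, assuming the Hugging Face transformers package and the public "roberta-base" checkpoint (both assumptions of this page, not part of the paper text):

```python
# Minimal sketch: querying a released RoBERTa checkpoint with the
# masked-LM objective it was pretrained on.
# Assumes `pip install transformers` and the public "roberta-base"
# checkpoint (assumptions, not claims from the paper itself).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="roberta-base")

# RoBERTa uses "<mask>" as its mask token.
predictions = fill_mask("Language model <mask> has led to significant performance gains.")
for candidate in predictions:
    print(f"{candidate['token_str']!r:15} score={candidate['score']:.3f}")
```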

Code Repositories

hkuds/easyrec (pytorch, mentioned in GitHub)
expertailab/spaceqa (pytorch, mentioned in GitHub)
awslabs/mlm-scoring (mxnet, mentioned in GitHub)
common-english/bert-all (pytorch, mentioned in GitHub)
pytorch/fairseq (official, pytorch)
benywon/ReCO (pytorch, mentioned in GitHub)
UnknownGenie/altered-BERT-KPE (pytorch, mentioned in GitHub)
knuddj1/op_text (pytorch, mentioned in GitHub)
znhy1024/protoco (pytorch, mentioned in GitHub)
CalumPerrio/WNUT-2020 (pytorch, mentioned in GitHub)
zfj1998/CodeBert-Code2Text (pytorch, mentioned in GitHub)
dig-team/hanna-benchmark-asg (pytorch, mentioned in GitHub)
flexible-fl/flex-nlp (mentioned in GitHub)
salesforce/codet5 (pytorch, mentioned in GitHub)
musixmatchresearch/umberto (pytorch, mentioned in GitHub)
facebookresearch/anli (pytorch, mentioned in GitHub)
nguyenvulebinh/vietnamese-roberta (pytorch, mentioned in GitHub)
viethoang1512/kpa (pytorch, mentioned in GitHub)
knuddy/op_text (pytorch, mentioned in GitHub)
wzzzd/LM_NER (pytorch, mentioned in GitHub)
sdadas/polish-roberta (pytorch, mentioned in GitHub)
Tencent/TurboTransformers (pytorch, mentioned in GitHub)
abdumaa/hiqualprop (pytorch, mentioned in GitHub)
huggingface/transformers (pytorch, mentioned in GitHub)
oneflow-inc/libai (mentioned in GitHub)
clovaai/textual-kd-slu (pytorch, mentioned in GitHub)
aistairc/kirt_bert_on_abci (pytorch, mentioned in GitHub)
pisalore/roberta_results (pytorch, mentioned in GitHub)
bcaitech1/p2-klue-Heeseok-Jeong (pytorch, mentioned in GitHub)
mthcom/hscore-dataset-pruning (pytorch, mentioned in GitHub)
lashoun/hanna-benchmark-asg (pytorch, mentioned in GitHub)
kaushaltrivedi/fast-bert (pytorch, mentioned in GitHub)
octanove/shiba (pytorch, mentioned in GitHub)
traviscoan/cards (mentioned in GitHub)
utterworks/fast-bert (pytorch, mentioned in GitHub)
zaradana/Fast_BERT (pytorch, mentioned in GitHub)
brightmart/roberta_zh (tf, mentioned in GitHub)
blawok/named-entity-recognition (pytorch, mentioned in GitHub)
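The official release lives in pytorch/fairseq, listed above. A minimal sketch of loading those weights and extracting sentence features is shown below; it assumes the torch.hub entry points documented in the fairseq repository, so treat the exact model names as an assumption rather than part of this page:

```python
# Minimal sketch: loading the officially released RoBERTa weights via the
# pytorch/fairseq repository (torch.hub entry point as documented there).
import torch

roberta = torch.hub.load('pytorch/fairseq', 'roberta.large')
roberta.eval()  # disable dropout so feature extraction is deterministic

tokens = roberta.encode('Hello world!')       # BPE-encode text into a tensor of token ids
features = roberta.extract_features(tokens)   # last-layer hidden states
print(features.shape)                         # e.g. torch.Size([1, seq_len, 1024])
```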

Benchmarks

Benchmark | Method | Metrics
common-sense-reasoning-on-commonsenseqa | RoBERTa-Large 355M | Accuracy: 72.1
common-sense-reasoning-on-swag | RoBERTa | Test: 89.9
document-image-classification-on-rvl-cdip | RoBERTa base | Accuracy: 90.06; Parameters: 125M
linguistic-acceptability-on-cola | RoBERTa (ensemble) | Accuracy: 67.8%
multi-task-language-understanding-on-mmlu | RoBERTa-base 125M (fine-tuned) | Average (%): 27.9
natural-language-inference-on-anli-test | RoBERTa (Large) | A1: 72.4; A2: 49.8; A3: 44.4
natural-language-inference-on-multinli | RoBERTa | Matched: 90.8
natural-language-inference-on-multinli | RoBERTa (ensemble) | Mismatched: 90.2
natural-language-inference-on-qnli | RoBERTa (ensemble) | Accuracy: 98.9%
natural-language-inference-on-rte | RoBERTa | Accuracy: 88.2%
natural-language-inference-on-rte | RoBERTa (ensemble) | Accuracy: 88.2%
natural-language-inference-on-wnli | RoBERTa (ensemble) | Accuracy: 89
question-answering-on-piqa | RoBERTa-Large 355M | Accuracy: 79.4
question-answering-on-quora-question-pairs | RoBERTa (ensemble) | Accuracy: 90.2%
question-answering-on-social-iqa | RoBERTa-Large 355M (fine-tuned) | Accuracy: 76.7
question-answering-on-squad20 | RoBERTa (single model) | EM: 86.820; F1: 89.795
question-answering-on-squad20-dev | RoBERTa (no data aug) | EM: 86.5; F1: 89.4
reading-comprehension-on-race | RoBERTa | Accuracy: 83.2; Accuracy (High): 81.3; Accuracy (Middle): 86.5
semantic-textual-similarity-on-mrpc | RoBERTa (ensemble) | Accuracy: 92.3%
semantic-textual-similarity-on-sts-benchmark | RoBERTa | Pearson Correlation: 0.922
sentiment-analysis-on-sst-2-binary | RoBERTa (ensemble) | Accuracy: 96.7
stock-market-prediction-on-astock | RoBERTa WWM Ext (News+Factors) | Accuracy: 62.49; F1-score: 62.54; Precision: 62.59; Recall: 62.51
stock-market-prediction-on-astock | RoBERTa WWM Ext (News) | Accuracy: 61.34; F1-score: 61.48; Precision: 61.97; Recall: 61.32
task-1-grouping-on-ocw | RoBERTa (LARGE) | # Correct Groups: 29 ± 3; # Solved Walls: 0 ± 0; Adjusted Mutual Information (AMI): 9.4 ± .4; Adjusted Rand Index (ARI): 8.4 ± .3; Fowlkes Mallows Score (FMS): 26.7 ± .2; Wasserstein Distance (WD): 88.4 ± .4
text-classification-on-arxiv-10 | RoBERTa | Accuracy: 0.779
type-prediction-on-manytypes4typescript | RoBERTa | Average Accuracy: 59.84; Average F1: 57.54; Average Precision: 57.45; Average Recall: 57.62
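The GLUE-style entries above (e.g. MultiNLI, RTE, SST-2) come from fine-tuning the pretrained encoder with a task-specific classification head on top. The sketch below shows that setup for a sentence-pair task using the Hugging Face transformers classes; the checkpoint name and the untrained head are illustrative assumptions, and the head must be fine-tuned on task data before its logits are meaningful:

```python
# Minimal sketch of the fine-tuning setup behind the GLUE-style results:
# pretrained RoBERTa encoder + a randomly initialised classification head.
# "roberta-base" and num_labels=3 (NLI-style labels) are assumptions here.
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=3)

# Sentence-pair input, as in natural language inference.
inputs = tokenizer(
    "A man is playing a guitar.",   # premise
    "A person is making music.",    # hypothesis
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 3); meaningful only after fine-tuning
print(logits)
```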
