4 months ago

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Abstract

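With the capability of modeling bidirectional contexts, denoising-autoencoding-based pretraining such as BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, because it relies on corrupting the input with masks, BERT neglects the dependency between masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, under comparable experimental settings, XLNet outperforms BERT on 20 tasks, often by a large margin, including question answering, natural language inference, sentiment analysis, and document ranking.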
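The permutation idea in point (1) can be illustrated with a minimal sketch. It assumes nothing beyond the abstract: for a given factorization order, each position may condition only on the positions that come earlier in that order, and taken over all orders every position ends up seeing every other position as context. The function names `context_mask` and `positions_ever_seen` are hypothetical, the sketch enumerates all orders instead of sampling them, and it contains no model or training code, so it should not be read as the authors' implementation.

```python
from itertools import permutations


def context_mask(order):
    """Return mask where mask[i][j] is True if position i may condition on
    position j under the given factorization order (a permutation of 0..T-1)."""
    T = len(order)
    # rank[pos] = step at which position `pos` is predicted in this order
    rank = {pos: step for step, pos in enumerate(order)}
    return [[rank[j] < rank[i] for j in range(T)] for i in range(T)]


def positions_ever_seen(T):
    """For each target position, collect the union of its context positions
    over all T! factorization orders (illustration only, exponential cost)."""
    seen = [set() for _ in range(T)]
    for order in permutations(range(T)):
        mask = context_mask(order)
        for i in range(T):
            seen[i].update(j for j in range(T) if mask[i][j])
    return seen


if __name__ == "__main__":
    # One sampled order: predict position 2 first, then 0, then 3, then 1.
    order = (2, 0, 3, 1)
    for i, row in enumerate(context_mask(order)):
        print(f"position {i} conditions on {[j for j, ok in enumerate(row) if ok]}")
    # Across all orders, each position has every other position as context.
    print(positions_ever_seen(4))
```

With the identity order `(0, 1, 2, 3)` the mask reduces to the usual left-to-right autoregressive mask, which is why the abstract describes the method as a generalization of autoregressive pretraining.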

Benchmarks

| Benchmark | Method | Metrics |
| --- | --- | --- |
| document-ranking-on-clueweb09-b | XLNet | ERR@20: 20.28; nDCG@20: 31.10 |
| humor-detection-on-200k-short-texts-for-humor-1 | XLNet Large Cased | F1-score: 0.920 |
| linguistic-acceptability-on-cola | XLNet (single model) | Accuracy: 69% |
| natural-language-inference-on-anli-test | XLNet (Large) | A1: 70.3; A2: 50.9; A3: 49.4 |
| natural-language-inference-on-multinli | XLNet (single model) | Matched: 90.8 |
| natural-language-inference-on-qnli | XLNet (single model) | Accuracy: 94.9% |
| natural-language-inference-on-rte | XLNet (single model) | Accuracy: 85.9% |
| natural-language-inference-on-wnli | XLNet | Accuracy: 92.5 |
| paraphrase-identification-on-quora-question | XLNet-Large (ensemble) | Accuracy: 90.3; F1: 74.2 |
| question-answering-on-quora-question-pairs | XLNet (single model) | Accuracy: 92.3% |
| question-answering-on-race | XLNet | RACE: 81.75; RACE-m: 85.45 |
| question-answering-on-squad11 | XLNet (single model) | EM: 89.898; F1: 95.080; Hardware Burden: 46449G |
| question-answering-on-squad11-dev | XLNet (single model) | EM: 89.7; F1: 95.1 |
| question-answering-on-squad20 | XLNet (single model) | EM: 87.926; F1: 90.689 |
| question-answering-on-squad20-dev | XLNet (single model) | EM: 87.9; F1: 90.6 |
| reading-comprehension-on-race | XLNet | Accuracy (High): 84.0; Accuracy (Middle): 88.6 |
| semantic-textual-similarity-on-mrpc | XLNet (single model) | Accuracy: 90.8% |
| semantic-textual-similarity-on-senteval | XLNet-Large | MRPC: 93.0/90.7; SICK-E: -; SICK-R: -; STS: 91.6/91.1* |
| semantic-textual-similarity-on-sts-benchmark | XLNet (single model) | Pearson Correlation: 0.925 |
| sentiment-analysis-on-imdb | XLNet | Accuracy: 96.21 |
| sentiment-analysis-on-sst-2-binary | XLNet-Large (ensemble) | Accuracy: 96.8 |
| sentiment-analysis-on-sst-2-binary | XLNet (single model) | Accuracy: 97 |
| sentiment-analysis-on-yelp-binary | XLNet | Error: 1.37 |
| sentiment-analysis-on-yelp-fine-grained | XLNet | Error: 27.05 |
| text-classification-on-ag-news | XLNet | Error: 4.45 |
| text-classification-on-amazon-2 | XLNet | Error: 2.11 |
| text-classification-on-amazon-5 | XLNet | Error: 31.67 |
| text-classification-on-dbpedia | XLNet | Error: 0.62 |
| text-classification-on-yelp-2 | XLNet | Accuracy: 98.63% |
| text-classification-on-yelp-5 | XLNet | Accuracy: 72.95% |
