ERNIE 2.0: A Continual Pre-training Framework for Language Understanding

Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu, Haifeng Wang

Abstract

Recently, pre-trained models have achieved state-of-the-art results on various language understanding tasks, which indicates that pre-training on large-scale corpora may play a crucial role in natural language processing. Current pre-training procedures usually focus on training the model with several simple tasks to capture the co-occurrence of words or sentences. However, besides co-occurrence, training corpora contain other valuable lexical, syntactic, and semantic information, such as named entities, semantic closeness, and discourse relations. To extract this lexical, syntactic, and semantic information from training corpora to the fullest extent, we propose a continual pre-training framework named ERNIE 2.0, which incrementally builds and learns pre-training tasks through constant multi-task learning. Experimental results demonstrate that ERNIE 2.0 outperforms BERT and XLNet on 16 tasks, including the English tasks of the GLUE benchmark and several common Chinese tasks. The source code and pre-trained models have been released at https://github.com/PaddlePaddle/ERNIE.
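
To make the training schedule concrete, here is a minimal, runnable sketch of the sequential multi-task idea the abstract describes: pre-training tasks are introduced one at a time, and each new stage keeps optimizing all tasks seen so far, so earlier signals are retained rather than overwritten. The `SharedEncoder`, `TaskHead`, and `train_step` names, the abridged task list, and the placeholder losses are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of ERNIE 2.0-style sequential multi-task learning,
# assuming a shared encoder with one lightweight head per pre-training task.
# All classes and losses here are hypothetical placeholders.

import random


class SharedEncoder:
    """Stands in for the transformer encoder shared by all tasks."""

    def encode(self, batch):
        return batch  # placeholder: return features unchanged


class TaskHead:
    """Stands in for a task-specific output layer (hypothetical)."""

    def __init__(self, name):
        self.name = name

    def loss(self, features):
        return random.random()  # placeholder loss value


def train_step(encoder, head, batch):
    """One optimization step on a single task's batch (schematic)."""
    features = encoder.encode(batch)
    return head.loss(features)


def continual_pretrain(task_names, steps_per_stage=3):
    """Introduce tasks one by one; at each stage, keep multi-task
    training on *all* tasks seen so far, so knowledge learned from
    earlier tasks is not forgotten."""
    encoder = SharedEncoder()
    active_heads = []
    for new_task in task_names:
        active_heads.append(TaskHead(new_task))
        for _ in range(steps_per_stage):
            # Round-robin over every task introduced so far.
            for head in active_heads:
                batch = f"batch-for-{head.name}"
                loss = train_step(encoder, head, batch)
                print(f"stage={new_task:<20s} task={head.name:<20s} loss={loss:.3f}")


# The paper groups its pre-training tasks into word-, structure- and
# semantic-aware levels; this list is abridged for illustration.
continual_pretrain([
    "knowledge-masking",    # word-aware
    "sentence-reordering",  # structure-aware
    "discourse-relation",   # semantic-aware
])
```

The key design choice is the inner loop over `active_heads`: unlike plain sequential fine-tuning, every stage revisits earlier tasks, which is what makes the pre-training "continual" in the paper's sense.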

Benchmarks

Benchmark | Methodology | Metrics
chinese-named-entity-recognition-on-msra | ERNIE 2.0 Base | F1: 93.8
chinese-named-entity-recognition-on-msra | ERNIE 2.0 Large | F1: 95.0
chinese-named-entity-recognition-on-msra-dev | ERNIE 2.0 Large | F1: 96.3
chinese-named-entity-recognition-on-msra-dev | ERNIE 2.0 Base | F1: 95.2
linguistic-acceptability-on-cola | ERNIE 2.0 Large | Accuracy: 63.5%
linguistic-acceptability-on-cola | ERNIE 2.0 Base | Accuracy: 55.2%
natural-language-inference-on-multinli | ERNIE 2.0 Large | Matched: 88.7, Mismatched: 88.8
natural-language-inference-on-multinli | ERNIE 2.0 Base | Matched: 86.1, Mismatched: 85.5
natural-language-inference-on-qnli | ERNIE 2.0 Large | Accuracy: 94.6%
natural-language-inference-on-qnli | ERNIE 2.0 Base | Accuracy: 92.9%
natural-language-inference-on-rte | ERNIE 2.0 Base | Accuracy: 74.8%
natural-language-inference-on-rte | ERNIE 2.0 Large | Accuracy: 80.2%
natural-language-inference-on-wnli | ERNIE 2.0 Large | Accuracy: 67.8%
natural-language-inference-on-xnli-chinese | ERNIE 2.0 Base | Accuracy: 81.2%
natural-language-inference-on-xnli-chinese | ERNIE 2.0 Large | Accuracy: 82.6%
natural-language-inference-on-xnli-chinese-1 | ERNIE 2.0 Base | Accuracy: 79.7%
natural-language-inference-on-xnli-chinese-1 | ERNIE 2.0 Large | Accuracy: 81.0%
open-domain-question-answering-on-dureader | ERNIE 2.0 Base | EM: 61.3
open-domain-question-answering-on-dureader | ERNIE 2.0 Large | EM: 64.2
question-answering-on-quora-question-pairs | ERNIE 2.0 Large | Accuracy: 90.1%
question-answering-on-quora-question-pairs | ERNIE 2.0 Base | Accuracy: 89.8%
semantic-textual-similarity-on-mrpc | ERNIE 2.0 Base | Accuracy: 86.1%
semantic-textual-similarity-on-mrpc | ERNIE 2.0 Large | Accuracy: 87.4%
semantic-textual-similarity-on-sts-benchmark | ERNIE 2.0 Large | Pearson Correlation: 0.912
semantic-textual-similarity-on-sts-benchmark | ERNIE 2.0 Base | Pearson Correlation: 0.876
sentiment-analysis-on-sst-2-binary | ERNIE 2.0 Base | Accuracy: 95.0%
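
For readers who want to try the released checkpoints behind these numbers, the sketch below shows one common way to load an ERNIE 2.0 model with PaddleNLP. The model name "ernie-2.0-base-en" and the (sequence_output, pooled_output) return signature are assumptions about the installed paddlenlp version; the linked GitHub repository is the authoritative source for download and fine-tuning instructions.

```python
# A hedged usage sketch: loading a released ERNIE 2.0 English base model
# via PaddleNLP. "ernie-2.0-base-en" is assumed to be a registered
# pretrained name in your paddlenlp install; consult
# https://github.com/PaddlePaddle/ERNIE for official instructions.
import paddle
from paddlenlp.transformers import ErnieModel, ErnieTokenizer

tokenizer = ErnieTokenizer.from_pretrained("ernie-2.0-base-en")
model = ErnieModel.from_pretrained("ernie-2.0-base-en")
model.eval()

# Encode a sentence and run a forward pass without gradients.
inputs = tokenizer("ERNIE 2.0 is a continual pre-training framework.")
input_ids = paddle.to_tensor([inputs["input_ids"]])
with paddle.no_grad():
    # Older paddlenlp versions return (sequence_output, pooled_output);
    # newer ones may return a model-output object instead.
    sequence_output, pooled_output = model(input_ids)

print(sequence_output.shape)  # [1, seq_len, hidden_size]
```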
