Improving Language Understanding by Generative Pre-Training

Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever

Abstract

Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied. For instance, we achieve absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI).
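The "task-aware input transformations" mentioned above convert structured task inputs (e.g. premise/hypothesis pairs, or a context with candidate answers) into single ordered token sequences that the pre-trained Transformer language model can consume directly, using special start, delimiter, and extract tokens. A minimal sketch of that idea, with illustrative token strings rather than the paper's actual vocabulary entries:

```python
# Illustrative special tokens; the paper learns embeddings for
# randomly initialized start/delimiter/extract tokens.
START, DELIM, EXTRACT = "<s>", "$", "<e>"

def entailment_input(premise_tokens, hypothesis_tokens):
    """Entailment: concatenate premise and hypothesis into one
    sequence, separated by a delimiter token."""
    return [START] + premise_tokens + [DELIM] + hypothesis_tokens + [EXTRACT]

def multiple_choice_inputs(context_tokens, answer_token_lists):
    """Question answering / commonsense reasoning: build one
    sequence per candidate answer; each is processed independently
    and the per-sequence scores are normalized with a softmax."""
    return [[START] + context_tokens + [DELIM] + answer + [EXTRACT]
            for answer in answer_token_lists]
```

During fine-tuning, the hidden state at the extract token position is fed to a small linear output layer, so the pre-trained architecture itself needs essentially no modification.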

Benchmarks

Benchmark                                Methodology                           Metrics
natural-language-inference-on-multinli   Finetuned Transformer LM              Matched: 82.1; Mismatched: 81.4
natural-language-inference-on-scitail    Finetuned Transformer LM              Accuracy: 88.3
natural-language-inference-on-snli       Fine-Tuned LM-Pretrained Transformer  % Test Accuracy: 89.9; % Train Accuracy: 96.6; Parameters: 85m
question-answering-on-race               Finetuned Transformer LM              RACE: 59.0; RACE-h: 57.4; RACE-m: 62.9
question-answering-on-storycloze         Finetuned Transformer LM              Accuracy: 86.5
