Improving Language Understanding by Generative Pre-Training

Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever

Abstract

Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied. For instance, we achieve absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI).
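The "task-aware input transformations" mentioned above convert structured task inputs (e.g. premise/hypothesis pairs, or a context with candidate answers) into single ordered token sequences that the pre-trained Transformer language model can consume directly, using special start, delimiter, and extract tokens. A minimal sketch of that idea, with illustrative token strings rather than the paper's actual vocabulary entries:

```python
# Illustrative special tokens; the paper learns embeddings for
# randomly initialized start/delimiter/extract tokens.
START, DELIM, EXTRACT = "<s>", "$", "<e>"

def entailment_input(premise_tokens, hypothesis_tokens):
    """Entailment: concatenate premise and hypothesis into one
    sequence, separated by a delimiter token."""
    return [START] + premise_tokens + [DELIM] + hypothesis_tokens + [EXTRACT]

def multiple_choice_inputs(context_tokens, answer_token_lists):
    """Question answering / commonsense reasoning: build one
    sequence per candidate answer; each is processed independently
    and the per-sequence scores are normalized with a softmax."""
    return [[START] + context_tokens + [DELIM] + answer + [EXTRACT]
            for answer in answer_token_lists]
```

During fine-tuning, the hidden state at the extract token position is fed to a small linear output layer, so the pre-trained architecture itself needs essentially no modification.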

Benchmarks

Benchmark                                Methodology                           Metrics
natural-language-inference-on-multinli   Finetuned Transformer LM              Matched: 82.1; Mismatched: 81.4
natural-language-inference-on-scitail    Finetuned Transformer LM              Accuracy: 88.3
natural-language-inference-on-snli       Fine-Tuned LM-Pretrained Transformer  % Test Accuracy: 89.9; % Train Accuracy: 96.6; Parameters: 85m
question-answering-on-race               Finetuned Transformer LM              RACE: 59.0; RACE-h: 57.4; RACE-m: 62.9
question-answering-on-storycloze         Finetuned Transformer LM              Accuracy: 86.5
