Deep contextualized word representations
Matthew E. Peters; Mark Neumann; Mohit Iyyer; Matt Gardner; Christopher Clark; Kenton Lee; Luke Zettlemoyer

Abstract
We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus. We show that these representations can be easily added to existing models and significantly improve the state of the art across six challenging NLP problems, including question answering, textual entailment and sentiment analysis. We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| citation-intent-classification-on-acl-arc | BiLSTM-Attention + ELMo | Macro-F1: 54.6 |
| conversational-response-selection-on-polyai | ELMo | 1-of-100 Accuracy: 19.3% |
| coreference-resolution-on-ontonotes | e2e-coref + ELMo | F1: 70.4 |
| named-entity-recognition-ner-on-conll-2003 | BiLSTM-CRF + ELMo | F1: 92.22 |
| named-entity-recognition-on-conll | BiLSTM-CRF + ELMo | F1: 93.42 |
| natural-language-inference-on-snli | ESIM + ELMo Ensemble | Test Accuracy: 89.3%; Train Accuracy: 92.1%; Parameters: 40M |
| natural-language-inference-on-snli | ESIM + ELMo | Test Accuracy: 88.7%; Train Accuracy: 91.6%; Parameters: 8.0M |
| question-answering-on-squad11 | BiDAF + Self Attention + ELMo (ensemble) | EM: 81.003; F1: 87.432 |
| question-answering-on-squad11 | BiDAF + Self Attention + ELMo (single model) | EM: 78.58; F1: 85.833 |
| question-answering-on-squad11-dev | BiDAF + Self Attention + ELMo | F1: 85.6 |
| question-answering-on-squad20 | BiDAF + Self Attention + ELMo (single model) | EM: 63.372; F1: 66.251 |
| semantic-role-labeling-on-ontonotes | He et al., 2017 + ELMo | F1: 84.6 |
| sentiment-analysis-on-sst-5-fine-grained | BCN + ELMo | Accuracy: 54.7 |
| task-1-grouping-on-ocw | ELMo (LARGE) | # Correct Groups: 55 ± 4; # Solved Walls: 0 ± 0; Adjusted Mutual Information (AMI): 14.5 ± 0.4; Adjusted Rand Index (ARI): 11.8 ± 0.4; Fowlkes-Mallows Score (FMS): 29.5 ± 0.3; Wasserstein Distance (WD): 86.3 ± 0.6 |
| word-sense-disambiguation-on-supervised | ELMo | SemEval 2007: 62.2; SemEval 2013: 66.2; SemEval 2015: 71.3; Senseval 2: 71.6; Senseval 3: 69.6 |