SpanBERT: Improving Pre-training by Representing and Predicting Spans

Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy


Abstract

We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text. Our approach extends BERT by (1) masking contiguous random spans, rather than random tokens, and (2) training the span boundary representations to predict the entire content of the masked span, without relying on the individual token representations within it. SpanBERT consistently outperforms BERT and our better-tuned baselines, with substantial gains on span selection tasks such as question answering and coreference resolution. In particular, with the same training data and model size as BERT-large, our single model obtains 94.6% and 88.7% F1 on SQuAD 1.1 and 2.0, respectively. We also achieve a new state of the art on the OntoNotes coreference resolution task (79.6% F1), strong performance on the TACRED relation extraction benchmark, and even show gains on GLUE.
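The contiguous span masking described in the abstract can be sketched in a few lines: span lengths are drawn from a clipped geometric distribution and whole spans are masked until a token budget is reached. The parameter values below (p=0.2, spans clipped at 10 tokens, a 15% masking budget) match the setup the paper reports, but the function names and implementation details are illustrative assumptions, not the authors' code.

```python
import random

def sample_span_length(p=0.2, max_len=10):
    """Draw a span length from a geometric distribution, clipped at max_len.
    P(length = k) is proportional to (1 - p)^(k - 1) * p."""
    length = 1
    while length < max_len and random.random() >= p:
        length += 1
    return length

def mask_spans(tokens, mask_ratio=0.15, p=0.2, max_len=10, mask="[MASK]"):
    """Mask contiguous random spans (not individual tokens) until roughly
    mask_ratio of the sequence is covered.  Returns the masked sequence and
    the sorted indices of masked positions; in SpanBERT those positions are
    then predicted both with the usual MLM loss and from the span's boundary
    representations (the span boundary objective)."""
    n = len(tokens)
    budget = max(1, round(n * mask_ratio))
    masked = set()
    while len(masked) < budget:
        # Clip the sampled length so we never exceed the budget.
        length = min(sample_span_length(p, max_len), budget - len(masked))
        start = random.randrange(0, n - length + 1)
        masked.update(range(start, start + length))
    return [mask if i in masked else t for i, t in enumerate(tokens)], sorted(masked)
```

For example, on a 100-token sequence this masks exactly 15 positions, grouped into a handful of contiguous spans rather than scattered singletons.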

Code Repositories

- facebookresearch/SpanBERT (official, PyTorch)
- mandarjoshi90/coref (TensorFlow)
- UnknownGenie/altered-BERT-KPE (PyTorch)
- wooseok-AI/Korean_e2e_CR_BERT (TensorFlow)
- zixinzeng-jennifer/spanbert_trans (PyTorch)
- amore-upf/masked-coreference (TensorFlow)

Benchmarks

| Benchmark | Method | Metrics |
|---|---|---|
| Coreference Resolution on OntoNotes | SpanBERT | F1: 79.6 |
| Linguistic Acceptability on CoLA | SpanBERT | Accuracy: 64.3% |
| Natural Language Inference on MultiNLI | SpanBERT | Matched: 88.1 |
| Natural Language Inference on QNLI | SpanBERT | Accuracy: 94.3% |
| Natural Language Inference on RTE | SpanBERT | Accuracy: 79.0% |
| Open-Domain Question Answering on SearchQA | SpanBERT | F1: 84.8 |
| Paraphrase Identification on Quora Question Pairs | SpanBERT | Accuracy: 89.5; F1: 71.9 |
| Question Answering on Natural Questions | SpanBERT | F1: 82.5 |
| Question Answering on NewsQA | SpanBERT | F1: 73.6 |
| Question Answering on SQuAD 1.1 | SpanBERT (single model) | EM: 88.8; F1: 94.6; Hardware Burden: 586G |
| Question Answering on SQuAD 2.0 | SpanBERT | EM: 85.7; F1: 88.7 |
| Question Answering on SQuAD 2.0 (dev) | SpanBERT | F1: 86.8 |
| Question Answering on TriviaQA | SpanBERT | F1: 83.6 |
| Relation Classification on TACRED | SpanBERT | F1: 70.8 |
| Relation Extraction on Re-TACRED | SpanBERT | F1: 85.3 |
| Relation Extraction on TACRED | SpanBERT-large | F1: 70.8 |
| Semantic Textual Similarity on MRPC | SpanBERT | Accuracy: 90.9% |
| Semantic Textual Similarity on STS Benchmark | SpanBERT | Pearson Correlation: 0.899 |
| Sentiment Analysis on SST-2 (Binary) | SpanBERT | Accuracy: 94.8 |
