SpanBERT: Improving Pre-training by Representing and Predicting Spans

Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy


Abstract

We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text. Our approach extends BERT by (1) masking contiguous random spans, rather than random tokens, and (2) training the span boundary representations to predict the entire content of the masked span, without relying on the individual token representations within it. SpanBERT consistently outperforms BERT and our better-tuned baselines, with substantial gains on span selection tasks such as question answering and coreference resolution. In particular, with the same training data and model size as BERT-large, our single model obtains 94.6% and 88.7% F1 on SQuAD 1.1 and 2.0, respectively. We also achieve a new state of the art on the OntoNotes coreference resolution task (79.6% F1), strong performance on the TACRED relation extraction benchmark, and even show gains on GLUE.
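The contiguous span masking described in the abstract can be sketched in a few lines: span lengths are drawn from a clipped geometric distribution and whole spans are masked until a token budget is reached. The parameter values below (p=0.2, spans clipped at 10 tokens, a 15% masking budget) match the setup the paper reports, but the function names and implementation details are illustrative assumptions, not the authors' code.

```python
import random

def sample_span_length(p=0.2, max_len=10):
    """Draw a span length from a geometric distribution, clipped at max_len.
    P(length = k) is proportional to (1 - p)^(k - 1) * p."""
    length = 1
    while length < max_len and random.random() >= p:
        length += 1
    return length

def mask_spans(tokens, mask_ratio=0.15, p=0.2, max_len=10, mask="[MASK]"):
    """Mask contiguous random spans (not individual tokens) until roughly
    mask_ratio of the sequence is covered.  Returns the masked sequence and
    the sorted indices of masked positions; in SpanBERT those positions are
    then predicted both with the usual MLM loss and from the span's boundary
    representations (the span boundary objective)."""
    n = len(tokens)
    budget = max(1, round(n * mask_ratio))
    masked = set()
    while len(masked) < budget:
        # Clip the sampled length so we never exceed the budget.
        length = min(sample_span_length(p, max_len), budget - len(masked))
        start = random.randrange(0, n - length + 1)
        masked.update(range(start, start + length))
    return [mask if i in masked else t for i, t in enumerate(tokens)], sorted(masked)
```

For example, on a 100-token sequence this masks exactly 15 positions, grouped into a handful of contiguous spans rather than scattered singletons.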

Code Repositories

- facebookresearch/SpanBERT (official, PyTorch)
- mandarjoshi90/coref (TensorFlow)
- UnknownGenie/altered-BERT-KPE (PyTorch)
- wooseok-AI/Korean_e2e_CR_BERT (TensorFlow)
- zixinzeng-jennifer/spanbert_trans (PyTorch)
- amore-upf/masked-coreference (TensorFlow)

Benchmarks

| Benchmark | Method | Metrics |
|---|---|---|
| Coreference Resolution on OntoNotes | SpanBERT | F1: 79.6 |
| Linguistic Acceptability on CoLA | SpanBERT | Accuracy: 64.3% |
| Natural Language Inference on MultiNLI | SpanBERT | Matched: 88.1 |
| Natural Language Inference on QNLI | SpanBERT | Accuracy: 94.3% |
| Natural Language Inference on RTE | SpanBERT | Accuracy: 79.0% |
| Open-Domain Question Answering on SearchQA | SpanBERT | F1: 84.8 |
| Paraphrase Identification on Quora Question Pairs | SpanBERT | Accuracy: 89.5; F1: 71.9 |
| Question Answering on Natural Questions | SpanBERT | F1: 82.5 |
| Question Answering on NewsQA | SpanBERT | F1: 73.6 |
| Question Answering on SQuAD 1.1 | SpanBERT (single model) | EM: 88.8; F1: 94.6; Hardware Burden: 586G |
| Question Answering on SQuAD 2.0 | SpanBERT | EM: 85.7; F1: 88.7 |
| Question Answering on SQuAD 2.0 (dev) | SpanBERT | F1: 86.8 |
| Question Answering on TriviaQA | SpanBERT | F1: 83.6 |
| Relation Classification on TACRED | SpanBERT | F1: 70.8 |
| Relation Extraction on Re-TACRED | SpanBERT | F1: 85.3 |
| Relation Extraction on TACRED | SpanBERT-large | F1: 70.8 |
| Semantic Textual Similarity on MRPC | SpanBERT | Accuracy: 90.9% |
| Semantic Textual Similarity on STS Benchmark | SpanBERT | Pearson Correlation: 0.899 |
| Sentiment Analysis on SST-2 (Binary) | SpanBERT | Accuracy: 94.8 |
