Nils Reimers; Iryna Gurevych

Abstract
BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity (STS). However, they require that both sentences be fed into the network, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT. The construction of BERT makes it unsuitable for semantic similarity search as well as for unsupervised tasks like clustering. In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity. This reduces the effort for finding the most similar pair from 65 hours with BERT / RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy from BERT. We evaluate SBERT and SRoBERTa on common STS tasks and transfer learning tasks, where they outperform other state-of-the-art sentence embedding methods.
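The "about 50 million" figure follows from counting unordered sentence pairs, n(n-1)/2 for n = 10,000. The sketch below reproduces that count and illustrates why fixed-size embeddings make the search cheap: once each sentence has a vector, all pairwise cosine similarities reduce to one matrix product. The embeddings here are random stand-ins rather than real SBERT outputs, and the helper names `num_pairs` and `most_similar_pair` are hypothetical, chosen for illustration only.

```python
import numpy as np


def num_pairs(n: int) -> int:
    """Number of unordered sentence pairs a cross-encoder would have to score."""
    return n * (n - 1) // 2


def most_similar_pair(embeddings: np.ndarray) -> tuple[int, int, float]:
    """Return (i, j, cosine) for the most similar pair of rows, with i < j."""
    # Normalize each row so that a dot product equals cosine similarity.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = unit @ unit.T
    # Exclude each sentence's similarity with itself.
    np.fill_diagonal(sims, -np.inf)
    i, j = np.unravel_index(np.argmax(sims), sims.shape)
    return int(min(i, j)), int(max(i, j)), float(sims[i, j])


print(num_pairs(10_000))  # 49995000, i.e. roughly 50 million

# Random vectors stand in for sentence embeddings (an assumption for this demo);
# row 42 is planted as a near-duplicate of row 7.
rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 768))
emb[42] = emb[7] + 0.01 * rng.normal(size=768)

i, j, score = most_similar_pair(emb)
print(i, j)
```

With a pair-scoring model, every one of those pairs needs its own forward pass; with embeddings, each sentence is encoded once and the comparison step is plain linear algebra.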
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| semantic-textual-similarity-on-sick | SRoBERTa-NLI-large | Spearman Correlation: 0.7429 |
| semantic-textual-similarity-on-sick | SRoBERTa-NLI-base | Spearman Correlation: 0.7446 |
| semantic-textual-similarity-on-sick | SBERT-NLI-base | Spearman Correlation: 0.7291 |
| semantic-textual-similarity-on-sick | SBERT-NLI-large | Spearman Correlation: 0.7375 |
| semantic-textual-similarity-on-sick | SentenceBERT | Spearman Correlation: 0.7462 |
| semantic-textual-similarity-on-sts-benchmark | SRoBERTa-NLI-STSb-large | Spearman Correlation: 0.8615 |
| semantic-textual-similarity-on-sts-benchmark | SBERT-NLI-base | Spearman Correlation: 0.7703 |
| semantic-textual-similarity-on-sts-benchmark | SRoBERTa-NLI-base | Spearman Correlation: 0.7777 |
| semantic-textual-similarity-on-sts-benchmark | SBERT-NLI-large | Spearman Correlation: 0.79 |
| semantic-textual-similarity-on-sts-benchmark | SBERT-STSb-base | Spearman Correlation: 0.8479 |
| semantic-textual-similarity-on-sts-benchmark | SBERT-STSb-large | Spearman Correlation: 0.8445 |
| semantic-textual-similarity-on-sts12 | SRoBERTa-NLI-large | Spearman Correlation: 0.7453 |
| semantic-textual-similarity-on-sts13 | SBERT-NLI-large | Spearman Correlation: 0.7846 |
| semantic-textual-similarity-on-sts14 | SBERT-NLI-large | Spearman Correlation: 0.749 |
| semantic-textual-similarity-on-sts15 | SRoBERTa-NLI-large | Spearman Correlation: 0.8185 |
| semantic-textual-similarity-on-sts16 | SRoBERTa-NLI-large | Spearman Correlation: 0.7682 |
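The metric in the table, Spearman correlation, compares the ranking of predicted similarities with the ranking of gold similarity scores. As a minimal sketch, assuming it is computed as the Pearson correlation of average ranks, it can be written with NumPy alone. The function names and the example scores are made up for illustration and are not taken from any benchmark.

```python
import numpy as np


def average_ranks(values) -> np.ndarray:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks


def spearman(x, y) -> float:
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])


# Hypothetical cosine similarities and gold scores for four sentence pairs.
predicted = [0.10, 0.40, 0.35, 0.80]
gold = [1.0, 3.0, 2.0, 5.0]
rho = spearman(predicted, gold)
print(rho)
```

Because only the ordering matters, the predicted cosine similarities do not need to be on the same scale as the gold scores.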