SGPT: GPT Sentence Embeddings for Semantic Search
Niklas Muennighoff

Abstract
Decoder transformers have continued increasing in scale, reaching hundreds of billions of parameters. Due to their scale, the same decoder sets state-of-the-art results on various language tasks via prompting or fine-tuning. Yet, these large foundation models remain unusable for the related fields of semantic search and sentence embeddings. This prevents possibly new state-of-the-art results and forces organizations to train and maintain separate models. To this end, we propose SGPT to use decoders for sentence embeddings and semantic search via prompting or fine-tuning. At 5.8 billion parameters, SGPT improves on the previously best sentence embeddings by a margin of 7% and outperforms a concurrent method with 175 billion parameters, as measured on the BEIR search benchmark. Code, models and result files are freely available at https://github.com/Muennighoff/sgpt.
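The sketch below illustrates the bi-encoder idea the abstract describes: a decoder transformer produces token hidden states, which are pooled into one vector per sentence, and query/document vectors are compared by cosine similarity for search. This is a minimal illustration, not the authors' exact implementation; the checkpoint name and the position-weighted pooling details are assumptions, and the official models and code are in the repository linked above.

```python
# Minimal sketch of using a decoder transformer as a bi-encoder for sentence
# embeddings and semantic search, in the spirit of SGPT-BE. The checkpoint name
# below is an assumed example; see https://github.com/Muennighoff/sgpt for the
# released models.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "Muennighoff/SGPT-125M-weightedmean-nli-bitfit"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # GPT-style tokenizers lack a pad token
model = AutoModel.from_pretrained(model_name)
model.eval()

def embed(sentences):
    # Tokenize a batch of sentences into one padded tensor.
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state          # (batch, seq_len, dim)
    # Position-weighted mean pooling: later tokens get larger weights, which suits
    # causal decoders where late tokens have attended to the whole sentence.
    mask = batch["attention_mask"].unsqueeze(-1).float()    # (batch, seq_len, 1)
    positions = torch.arange(1, hidden.size(1) + 1).view(1, -1, 1).float()
    weights = positions * mask
    weights = weights / weights.sum(dim=1, keepdim=True)
    return (hidden * weights).sum(dim=1)                    # (batch, dim)

# Rank documents against a query by cosine similarity of their embeddings.
query_emb = embed(["What is semantic search?"])
doc_embs = embed(["Semantic search ranks documents by meaning.", "GPUs are fast."])
scores = torch.nn.functional.cosine_similarity(query_emb, doc_embs)
print(scores)
```

The cross-encoder variant (SGPT-CE) reported in the benchmarks below instead scores each query-document pair jointly via prompting, trading speed for accuracy on some tasks.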
Code Repositories
https://github.com/Muennighoff/sgpt
Benchmarks
| Benchmark | Model | Metric |
|---|---|---|
| information-retrieval-on-cqadupstack | SGPT-BE-5.8B | mAP@100: 0.160 |
| passage-retrieval-on-msmarco-beir | SGPT-CE-6.1B | nDCG@10: 0.290 |
| passage-retrieval-on-msmarco-beir | SGPT-BE-5.8B | nDCG@10: 0.399 |
| passage-retrieval-on-msmarco-beir | SGPT-CE-2.7B | nDCG@10: 0.278 |
| question-answering-on-fiqa-2018-beir | SGPT-BE-5.8B | nDCG@10: 0.372 |
| question-answering-on-fiqa-2018-beir | SGPT-CE-6.1B | nDCG@10: 0.401 |
| question-answering-on-hotpotqa-beir | SGPT-CE-6.1B | nDCG@10: 0.699 |
| question-answering-on-hotpotqa-beir | SGPT-BE-5.8B | nDCG@10: 0.593 |
| question-answering-on-nq-beir | SGPT-BE-5.8B | nDCG@10: 0.524 |
| question-answering-on-nq-beir | SGPT-CE-6.1B | nDCG@10: 0.401 |