Question Answering On Story Cloze
Evaluation Metric
Accuracy
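As a rough illustration of the metric, accuracy is the share of examples for which the predicted answer matches the reference answer, reported here as a percentage. The sketch below assumes each Story Cloze example is a choice between two candidate endings labelled 0 and 1. The function name `accuracy` and the sample labels are hypothetical and are not taken from any of the listed papers.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], gold: Sequence[int]) -> float:
    """Return the percentage of predictions that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted ending equals the correct ending.
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Hypothetical labels: index of the correct ending (0 or 1) for four stories.
gold = [0, 1, 1, 0]
predictions = [0, 1, 0, 0]
print(accuracy(predictions, gold))  # 75.0
```

With two candidate endings per story, random guessing would be expected to score about 50 on this scale.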
Results
Performance of each model on this benchmark
| Model | Accuracy (%) | Paper Title | Repository |
|---|---|---|---|
| Neo-6B (QA + WS) | 87.8 | Ask Me Anything: A simple strategy for prompting language models | |
| GPT-3 175B (Few-Shot) | 87.7 | Language Models are Few-Shot Learners | |
| PaLM 2-L (one-shot) | 87.4 | PaLM 2 Technical Report | |
| PaLM 2-M (one-shot) | 86.7 | PaLM 2 Technical Report | |
| PaLM 2-S (one-shot) | 85.6 | PaLM 2 Technical Report | |
| Neo-6B (QA) | 76.3 | Ask Me Anything: A simple strategy for prompting language models | |
| Neo-6B (few-shot) | 51.0 | Ask Me Anything: A simple strategy for prompting language models | |