Question Answering on DROP
Metrics
Accuracy
Results
Performance results of various models on this benchmark
| Model | Accuracy | Paper Title | Repository |
|---|---|---|---|
| PaLM 540B (Self Improvement, Self Consistency) | 83 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Self Consistency) | 78.2 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Self Improvement, CoT Prompting) | 76.2 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Self Improvement, Standard-Prompting) | 71.7 | Large Language Models Can Self-Improve | - |
| PaLM 540B (CoT Prompting) | 70.6 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Standard-Prompting) | 60 | Large Language Models Can Self-Improve | - |