| PaLM 2 (few-shot, CoT, SC) | 90.4 | PaLM 2 Technical Report | |
| ALBERT (Lan et al., 2020) (ensemble) | 76.5 | ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | |
| STaR without Rationalization (on GPT-J) | 68.8 | STaR: Bootstrapping Reasoning With Reasoning | |