| UL2 20B (fine-tuned) | 98.1 | UL2: Unifying Language Learning Paradigms | |
| ST-MoE-32B 269B (fine-tuned) | 96.6 | ST-MoE: Designing Stable and Transferable Sparse Expert Models | |
| Flan-T5 XXL (zero-shot) | 89.82 | Scaling Instruction-Finetuned Language Models | |
| FLAN 137B (prompt-tuned) | 86.5 | Finetuned Language Models Are Zero-Shot Learners | |
| TTTTT 3B (fine-tuned) | 84.6 | TTTTTackling WinoGrande Schemas | |