| language-modelling-on-big-bench-lite | GLM-130B (0-shot) | |
| language-modelling-on-big-bench-lite | GLM-130B (3-shot) | |
| language-modelling-on-big-bench-lite | GLM-130B (1-shot) | |
| language-modelling-on-clue-afqmc | ERNIE 3.0 Titan-260B | |
| language-modelling-on-clue-afqmc | GLM-130B | |
| language-modelling-on-clue-c3 | ERNIE 3.0 Titan-260B | |
| language-modelling-on-clue-c3 | GLM-130B | |
| language-modelling-on-clue-cmnli | ERNIE 3.0 Titan-260B | |
| language-modelling-on-clue-cmnli | GLM-130B | |
| language-modelling-on-clue-cmrc2018 | GLM-130B | |
| language-modelling-on-clue-cmrc2018 | ERNIE 3.0 Titan-260B | |
| language-modelling-on-clue-drcd | ERNIE 3.0 Titan-260B | |
| language-modelling-on-clue-drcd | GLM-130B | |
| language-modelling-on-clue-ocnli-50k | GLM-130B | |
| language-modelling-on-clue-ocnli-50k | ERNIE 3.0 Titan-260B | |
| language-modelling-on-clue-wsc1-1 | ERNIE 3.0 Titan-260B | |
| language-modelling-on-clue-wsc1-1 | GLM-130B | |
| language-modelling-on-fewclue-bustm | GLM-130B | |
| language-modelling-on-fewclue-bustm | ERNIE 3.0 Titan-260B | |
| language-modelling-on-fewclue-chid-fc | GLM-130B | |
| language-modelling-on-fewclue-chid-fc | ERNIE 3.0 Titan-260B | |
| language-modelling-on-fewclue-cluewsc-fc | GLM-130B | |
| language-modelling-on-fewclue-cluewsc-fc | ERNIE 3.0 Titan-260B | |
| language-modelling-on-fewclue-eprstmt | ERNIE 3.0 Titan-260B | |
| language-modelling-on-fewclue-eprstmt | GLM-130B | |
| language-modelling-on-fewclue-ocnli-fc | GLM-130B | |
| language-modelling-on-fewclue-ocnli-fc | ERNIE 3.0 Titan-260B | |
| language-modelling-on-lambada | GLM-130B (bidirectional attention) | |
| language-modelling-on-the-pile | Jurassic-1 | |
| language-modelling-on-the-pile | GLM-130B | |
| language-modelling-on-the-pile | GPT-3 | |
| long-context-understanding-on-ada-leval | ChatGLM3-6b-32k | 1k: 39.8 2k: 18.8 4k: 9.0 6k: 5.0 8k: 3.4 12k: 0.9 16k: 0.5 |
| long-context-understanding-on-ada-leval | ChatGLM2-6b-32k | 1k: 31.2 2k: 10.9 4k: 4.5 6k: 1.6 8k: 1.6 12k: 0.0 16k: 0.3 |
| long-context-understanding-on-ada-leval-tsort | ChatGLM2-6b-32k | 2k: 0.9 4k: 0.2 8k: 0.7 16k: 0.9 |
| long-context-understanding-on-ada-leval-tsort | ChatGLM3-6b-32k | 2k: 2.3 4k: 2.4 8k: 2.0 16k: 0.7 |
| multi-task-language-understanding-on-mmlu | GLM-130B | |
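Each metric cell above packs zero or more `context-length: score` pairs into a single pipe-delimited field. A minimal parsing sketch (hypothetical helper name `parse_row`; assumes rows keep this exact `| benchmark | model | metrics |` layout, with empty metric cells left empty):

```python
import re

def parse_row(line: str) -> dict:
    """Split a '| benchmark | model | metrics |' row into its three fields."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    benchmark, model, metrics = (cells + ["", "", ""])[:3]
    # Metric cells hold zero or more 'length: score' pairs, e.g. '1k: 39.8 2k: 18.8'.
    scores = {k: float(v) for k, v in re.findall(r"(\S+):\s*([\d.]+)", metrics)}
    return {"benchmark": benchmark, "model": model, "scores": scores}

row = "| long-context-understanding-on-ada-leval | ChatGLM3-6b-32k | 1k: 39.8 2k: 18.8 |"
print(parse_row(row))
# -> {'benchmark': 'long-context-understanding-on-ada-leval',
#     'model': 'ChatGLM3-6b-32k', 'scores': {'1k': 39.8, '2k': 18.8}}
```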