| Benchmark | Model | Metrics |
| --- | --- | --- |
| abstractive-text-summarization-on-cnn-daily | T5 | ROUGE-1: 43.52, ROUGE-2: 21.55, ROUGE-L: 40.69 |
| answer-generation-on-weibopolls | T5 | BLEU-1: 37.77, BLEU-3: 25.86, ROUGE-1: 46.20, ROUGE-L: 43.32 |
| common-sense-reasoning-on-record | T5-XXL 11B (fine-tuned) | |
| common-sense-reasoning-on-record | T5-11B | |
| coreference-resolution-on-winograd-schema | T5-XXL 11B (fine-tuned) | |
| document-summarization-on-cnn-daily-mail | T5-11B | ROUGE-1: 43.52, ROUGE-2: 21.55, ROUGE-L: 40.69 |
| linguistic-acceptability-on-cola | T5-Base | |
| linguistic-acceptability-on-cola | T5-XL 3B | |
| linguistic-acceptability-on-cola | T5-Small | |
| linguistic-acceptability-on-cola | T5-Large 770M | |
| linguistic-acceptability-on-cola | T5-11B | |
| machine-translation-on-wmt2014-english-french | T5 | |
| machine-translation-on-wmt2014-english-german | T5-11B | BLEU score: 32.1, Number of Params: 11110M |
| multimodal-intent-recognition-on-photochat | T5-3B | F1: 58.9, Precision: 54.1, Recall: 64.6 |
| multimodal-intent-recognition-on-photochat | T5-Base | F1: 58.1, Precision: 58.2, Recall: 57.9 |
| natural-language-inference-on-commitmentbank | T5-XXL 11B (fine-tuned) | |
| natural-language-inference-on-commitmentbank | T5-Large 770M (fine-tuned) | |
| natural-language-inference-on-commitmentbank | T5-Base 220M (fine-tuned) | |
| natural-language-inference-on-multinli | T5-Base | Matched: 87.1, Mismatched: 86.2 |
| natural-language-inference-on-multinli | T5-3B | Matched: 91.4, Mismatched: 91.2 |
| natural-language-inference-on-multinli | T5-11B | |
| natural-language-inference-on-multinli | T5-XXL 11B (fine-tuned) | |
| natural-language-inference-on-multinli | T5-Large 770M | |
| natural-language-inference-on-multinli | T5-Small | Matched: 82.4, Mismatched: 82.3 |
| natural-language-inference-on-multinli | T5-Large | |
| natural-language-inference-on-qnli | T5-Small | |
| natural-language-inference-on-qnli | T5-Base | |
| natural-language-inference-on-qnli | T5-11B | |
| natural-language-inference-on-qnli | T5-Large 770M | |
| natural-language-inference-on-qnli | T5-3B | |
| natural-language-inference-on-rte | T5-Large 770M | |
| natural-language-inference-on-rte | T5-Base 220M | |
| natural-language-inference-on-rte | T5-XL 3B | |
| natural-language-inference-on-rte | T5-XXL 11B (fine-tuned) | |
| natural-language-inference-on-rte | T5-Small | |
| natural-language-inference-on-wnli | T5-Base 220M | |
| natural-language-inference-on-wnli | T5-Large 770M | |
| natural-language-inference-on-wnli | T5-XL 3B | |
| natural-language-inference-on-wnli | T5-Small 60M | |
| natural-language-inference-on-wnli | T5-XXL 11B | |
| poll-generation-on-weibopolls | T5 | BLEU-1: 37.34, BLEU-3: 21.06, ROUGE-1: 45.33, ROUGE-L: 42.69 |
| question-answering-on-boolq | T5-Small 60M (fine-tuned) | |
| question-answering-on-boolq | T5-Base 220M (fine-tuned) | |
| question-answering-on-boolq | T5-XXL 11B (fine-tuned) | |
| question-answering-on-boolq | T5-Large 770M (fine-tuned) | |
| question-answering-on-copa | T5-XL 3B (fine-tuned) | |
| question-answering-on-copa | T5-XXL 11B (fine-tuned) | |
| question-answering-on-copa | T5-Large 770M (fine-tuned) | |
| question-answering-on-copa | T5-Base 220M (fine-tuned) | |
| question-answering-on-multirc | T5-XXL 11B (fine-tuned) | |
| question-answering-on-multirc | T5-11B | |
| question-answering-on-quora-question-pairs | T5-11B | |
| question-answering-on-quora-question-pairs | T5-Small | |
| question-answering-on-quora-question-pairs | T5-Base | |
| question-answering-on-quora-question-pairs | T5-3B | |
| question-answering-on-quora-question-pairs | T5-Large 770M | |
| question-answering-on-squad11-dev | T5-3B | |
| question-answering-on-squad11-dev | T5-Small | |
| question-answering-on-squad11-dev | T5-Base | |
| question-answering-on-squad11-dev | T5-11B | |
| question-answering-on-squad11-dev | T5-Large 770M | |
| question-answering-on-webquestions | T5.1.1-XXL+SSM | |
| question-generation-on-weibopolls | T5 | BLEU-1: 36.91, BLEU-3: 16.26, ROUGE-1: 44.46, ROUGE-L: 42.06 |
| semantic-parsing-on-webquestionssp | T5-11B (Raffel et al., 2020) | |
| semantic-textual-similarity-on-mrpc | T5-3B | |
| semantic-textual-similarity-on-mrpc | T5-Large | |
| semantic-textual-similarity-on-mrpc | T5-Small | |
| semantic-textual-similarity-on-mrpc | T5-11B | |
| semantic-textual-similarity-on-mrpc | T5-Base | |
| semantic-textual-similarity-on-sts-benchmark | T5-Large 770M | Spearman Correlation: 0.886 |
| semantic-textual-similarity-on-sts-benchmark | T5-Small | Pearson Correlation: 0.856, Spearman Correlation: 0.85 |
| semantic-textual-similarity-on-sts-benchmark | T5-11B | Pearson Correlation: 0.925, Spearman Correlation: 0.921 |
| semantic-textual-similarity-on-sts-benchmark | T5-Base | Pearson Correlation: 0.894 |
| semantic-textual-similarity-on-sts-benchmark | T5-Large | Pearson Correlation: 0.899 |
| semantic-textual-similarity-on-sts-benchmark | T5-3B | Pearson Correlation: 0.906, Spearman Correlation: 0.898 |
| sentiment-analysis-on-sst-2-binary | T5-11B | |
| sentiment-analysis-on-sst-2-binary | T5-3B | |
| sentiment-analysis-on-sst-2-binary | T5-Large 770M | |
| sentiment-analysis-on-sst-2-binary | T5-Small | |
| sentiment-analysis-on-sst-2-binary | T5-Base | |
| word-sense-disambiguation-on-words-in-context | T5-XXL 11B | |
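
Every row above is produced by the same architecture: T5 casts each benchmark into a single text-to-text format, so summarization, translation, acceptability judgments, and similarity regression differ only in the input prefix (Raffel et al., 2020). A minimal sketch of this interface using the public Hugging Face checkpoints is below; the prefixes come from the T5 paper, the example inputs are placeholders, and the specific fine-tuned checkpoints behind each leaderboard row are not assumed here.

```python
# Minimal sketch of T5's text-to-text interface, not the leaderboard
# fine-tuning setup. Requires: transformers, torch, sentencepiece.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")  # or t5-small / t5-large / t5-3b / t5-11b
model = T5ForConditionalGeneration.from_pretrained("t5-base")

prompts = [
    # Summarization (CNN/DailyMail); the article text is a placeholder.
    "summarize: The commission found that the bridge had gone uninspected for a decade ...",
    # Machine translation (WMT English-German).
    "translate English to German: The house is wonderful.",
    # Linguistic acceptability (CoLA).
    "cola sentence: The book was written by John.",
    # Semantic textual similarity (STS-B); the model emits the score as text.
    "stsb sentence1: A man is playing a guitar. sentence2: A person plays an instrument.",
]

for prompt in prompts:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_length=64)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```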