Natural Language Inference on LiDiRus
Evaluation Metrics
MCC
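MCC here stands for the Matthews correlation coefficient. As a rough illustration of how such a score can be computed, the sketch below assumes a binary label set (`entailment` / `not_entailment`); the function name `mcc` and the `positive` parameter are hypothetical and not taken from the benchmark's own evaluation code.

```python
import math

def mcc(y_true, y_pred, positive="entailment"):
    """Matthews correlation coefficient for binary labels (illustrative sketch)."""
    tp = fp = tn = fn = 0
    # Build the confusion-matrix counts, treating `positive` as the positive class.
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the score is 0 when any marginal count is zero.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

gold = ["entailment", "not_entailment", "entailment", "not_entailment"]
perfect = mcc(gold, gold)
constant = mcc(gold, ["not_entailment"] * 4)
```

Under this definition a classifier that always predicts one class scores 0, perfect agreement scores 1, and fully inverted predictions score -1.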
Evaluation Results
Performance of each model on this benchmark
| Model | MCC | Paper Title | Repository |
|---|---|---|---|
| Human Benchmark | 0.626 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | |
| ruRoberta-large finetune | 0.339 | - | - |
| ruT5-large-finetune | 0.32 | - | - |
| ruT5-base-finetune | 0.267 | - | - |
| ruBert-large finetune | 0.235 | - | - |
| RuGPT3Large | 0.231 | - | - |
| ruBert-base finetune | 0.224 | - | - |
| SBERT_Large_mt_ru_finetuning | 0.218 | - | - |
| SBERT_Large | 0.209 | - | - |
| RuBERT plain | 0.191 | - | - |
| Multilingual Bert | 0.189 | - | - |
| RuBERT conversational | 0.178 | - | - |
| heuristic majority | 0.147 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
| YaLM 1.0B few-shot | 0.124 | - | - |
| RuGPT3XL few-shot | 0.096 | - | - |
| MT5 Large | 0.061 | mT5: A massively multilingual pre-trained text-to-text transformer | |
| Baseline TF-IDF1.1 | 0.06 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | |
| RuGPT3Medium | 0.01 | - | - |
| majority_class | 0 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
| Random weighted | 0 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |