Common Sense Reasoning on RWSD

Evaluation Metric

Accuracy
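
As a minimal sketch, assuming accuracy here means the fraction of examples whose predicted label exactly matches the gold label, the metric could be computed as follows. The function name `accuracy` and the toy label lists are hypothetical and are not taken from RWSD or from any official evaluation script.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the fraction of predictions that exactly match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted label equals the gold label.
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical toy data for illustration only (not RWSD examples).
toy_labels = [0, 0, 1, 0]
toy_predictions = [0, 1, 1, 0]
toy_score = accuracy(toy_predictions, toy_labels)  # 3 of 4 predictions match
```

Scores in the results table are on this 0 to 1 scale, so a value such as 0.669 corresponds to roughly two thirds of examples answered correctly.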

Evaluation Results

Performance of each model on this benchmark

| Model | Accuracy | Paper Title | Repository |
| --- | --- | --- | --- |
| Human Benchmark | 0.84 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | |
| SBERT_Large_mt_ru_finetuning | 0.675 | - | - |
| RuBERT conversational | 0.669 | - | - |
| ruT5-large-finetune | 0.669 | - | - |
| Multilingual Bert | 0.669 | - | - |
| MT5 Large | 0.669 | mT5: A massively multilingual pre-trained text-to-text transformer | |
| RuGPT3Medium | 0.669 | - | - |
| heuristic majority | 0.669 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
| YaLM 1.0B few-shot | 0.669 | - | - |
| RuGPT3Small | 0.669 | - | - |
| ruBert-large finetune | 0.669 | - | - |
| ruBert-base finetune | 0.669 | - | - |
| majority_class | 0.669 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
| RuBERT plain | 0.669 | - | - |
| ruT5-base-finetune | 0.669 | - | - |
| Baseline TF-IDF1.1 | 0.662 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | |
| SBERT_Large | 0.662 | - | - |
| RuGPT3XL few-shot | 0.649 | - | - |
| RuGPT3Large | 0.636 | - | - |
| Random weighted | 0.597 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |