Common Sense Reasoning on ReCoRD

Evaluation Metrics

EM
F1
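
As a rough illustration of how the two metrics are typically computed for extractive reading comprehension, here is a minimal sketch. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles) and takes the best score over all gold answers; the function names are hypothetical and this is not the official scoring script for this benchmark.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the benchmark's own script may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """EM: 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """F1: harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_golds(metric, prediction: str, golds: list) -> float:
    """Score a prediction against every gold answer and keep the best."""
    return max(metric(prediction, gold) for gold in golds)
```

Leaderboard figures are then the mean of these per-example scores over the evaluation set, expressed as a percentage. For instance, `best_over_golds(token_f1, "Barack Obama", ["Obama"])` gives partial credit, while `exact_match` on the same pair gives none.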

Evaluation Results

Performance of each model on this benchmark

| Model | EM | F1 | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| Turing NLR v5 XXL 5.4B (fine-tuned) | 95.9 | 96.4 | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | - |
| ST-MoE-32B 269B (fine-tuned) | 95.1 | - | ST-MoE: Designing Stable and Transferable Sparse Expert Models | |
| DeBERTa-1.5B | 94.1 | 94.5 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | |
| PaLM 540B (fine-tuned) | 94.0 | 94.6 | PaLM: Scaling Language Modeling with Pathways | |
| Vega v2 6B (fine-tuned) | 93.9 | 94.4 | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | - |
| T5-XXL 11B (fine-tuned) | 93.4 | - | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | |
| GESA 500M | 91.7 | 92.2 | Integrating a Heterogeneous Graph with Entity-aware Self-attention using Relative Position Labels for Reading Comprehension Model | - |
| LUKE-Graph | 91.2 | 91.5 | LUKE-Graph: A Transformer-based Approach with Gated Relational Graph Attention for Cloze-style Reading Comprehension | - |
| LUKE (single model) | 90.640 | 91.209 | - | - |
| LUKE 483M | 90.6 | 91.2 | LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention | |
| KELM (finetuning RoBERTa-large based single model) | 89.1 | 89.6 | KELM: Knowledge Enhanced Pre-Trained Language Representations with Message Passing on Hierarchical Relational Graphs | |
| ST-MoE-L 4.1B (fine-tuned) | 88.9 | - | ST-MoE: Designing Stable and Transferable Sparse Expert Models | |
| FLAN 137B (prompt-tuned) | 85.1 | - | Finetuned Language Models Are Zero-Shot Learners | |
| XLNet + MTL + Verifier (ensemble) | 83.090 | 83.737 | - | - |
| GPT-3 Large 760M (0-shot) | 82.1 | - | Language Models are Few-Shot Learners | |
| CSRLM (single model) | 81.780 | 82.584 | - | - |
| XLNet + Verifier | 81.5 | 82.7 | Pingan Smart Health and SJTU at COIN - Shared Task: utilizing Pre-trained Language Models and Common-sense Knowledge in Machine Reading Tasks | - |
| XLNet + MTL + Verifier (single model) | 81.460 | 82.664 | - | - |
| Switch Transformer 9B | 79.9 | - | Efficient Language Modeling with Sparse all-MLP | - |
| SKG-NET (single model) | 79.480 | 80.038 | - | - |