Question Answering on SQuAD 1.1
Evaluation metrics
EM (exact match: the fraction of predictions that exactly match a ground-truth answer after normalization)
F1 (token-level F1 overlap between the predicted and ground-truth answer spans)
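Both metrics are computed on normalized answer strings (lowercased, with punctuation and the articles "a/an/the" removed), following the convention of the official SQuAD evaluation script. A minimal sketch of that computation:

```python
import re
import string
from collections import Counter

def normalize_answer(s: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace (SQuAD convention)."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(prediction: str, ground_truth: str) -> float:
    """EM: 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(ground_truth))

def f1_score(prediction: str, ground_truth: str) -> float:
    """Token-level F1 between the normalized prediction and ground truth."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)  # multiset intersection
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower", "eiffel tower"))           # 1.0 after normalization
print(f1_score("in the city of Paris", "Paris"))                 # partial token overlap
```

On the full dataset, each prediction is scored against all reference answers for its question (taking the maximum), and the per-question scores are averaged.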
Evaluation results
Performance of each model on this benchmark:
| Model | EM | F1 | Paper Title | Repository |
|---|---|---|---|---|
| ANNA (single model) | 90.622 | 95.719 | - | - |
| LUKE 483M | - | 95.4 | LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention | - |
| LUKE (single model) | 90.202 | 95.379 | LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention | - |
| XLNet (single model) | 89.898 | 95.080 | XLNet: Generalized Autoregressive Pretraining for Language Understanding | - |
| XLNET-123 (single model) | 89.646 | 94.930 | - | - |
| XLNET-123++ (single model) | 89.856 | 94.903 | - | - |
| XLNET-123+ (single model) | 89.709 | 94.859 | - | - |
| SpanBERT (single model) | 88.839 | 94.635 | SpanBERT: Improving Pre-training by Representing and Predicting Spans | - |
| BERTSP (single model) | 88.912 | 94.584 | - | - |
| Unnamed submission by NMC | 88.912 | 94.584 | - | - |
| BERT+WWM+MT (single model) | 88.650 | 94.393 | - | - |
| Tuned BERT-1seq Large Cased (single model) | 87.465 | 93.294 | - | - |
| BERT-LARGE (Ensemble+TriviaQA) | 87.4 | 93.2 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | - |
| BERT (ensemble) | 87.433 | 93.160 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | - |
| BART (TextBox 2.0) | - | 93.04 | TextBox 2.0: A Text Generation Library with Pre-trained Language Models | - |
| LinkBERT (large) | 87.45 | 92.7 | LinkBERT: Pretraining Language Models with Document Links | - |
| BERT+MT (single model) | 86.458 | 92.645 | - | - |