Fact Verification on KILT: FEVER
Evaluation Metrics
Accuracy
KILT-AC
R-Prec
Recall@5
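The metrics above are only named on this page, so the following is a minimal sketch of how they are commonly computed for a single claim. It assumes one gold provenance set of Wikipedia page IDs per claim; the official KILT scorer handles multiple provenance sets and may differ in details. The function names and page IDs are hypothetical, chosen for illustration. Accuracy is the fraction of claims with the correct label, and KILT-AC credits a correct label only when R-Prec for that claim equals 1.

```python
from typing import Sequence, Set


def r_precision(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Share of gold pages found in the top-R retrieved pages, R = len(relevant)."""
    if not relevant:
        return 0.0
    top_r = set(retrieved[: len(relevant)])
    return len(top_r & relevant) / len(relevant)


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Share of gold pages found in the top-k retrieved pages (simplified)."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def kilt_accuracy(
    predicted_label: str,
    gold_label: str,
    retrieved: Sequence[str],
    relevant: Set[str],
) -> float:
    """Credit a correct label only when retrieval is perfect (R-Prec == 1)."""
    label_correct = predicted_label == gold_label
    perfect_retrieval = r_precision(retrieved, relevant) == 1.0
    return 1.0 if label_correct and perfect_retrieval else 0.0


# Hypothetical example: two gold pages, four retrieved pages in ranked order.
gold_pages = {"p1", "p2"}
ranked_pages = ["p1", "p7", "p2", "p9"]

rprec = r_precision(ranked_pages, gold_pages)       # only p1 is in the top 2
recall5 = recall_at_k(ranked_pages, gold_pages, 5)  # both gold pages in top 5
kilt_ac = kilt_accuracy("SUPPORTS", "SUPPORTS", ranked_pages, gold_pages)
```

Under these definitions a system that predicts labels without retrieving any pages can have a high Accuracy while scoring 0 on the other three metrics.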
Evaluation Results
Performance of each model on this benchmark
| Model | Accuracy | KILT-AC | R-Prec | Recall@5 | Paper Title | Repository |
|---|---|---|---|---|---|---|
| Re2G | 89.55 | 78.53 | 88.92 | 92.52 | Re2G: Retrieve, Rerank, Generate | |
| intersect | 89.54 | 71.28 | 81.45 | 89.56 | - | - |
| Sphere | 89.12 | 0.0 | 0.0 | 0.0 | - | - |
| Wikipedia | 88.99 | 65.68 | 74.77 | 87.89 | - | - |
| aa_evalai | 88.45 | 0.0 | 0.0 | 0.0 | - | - |
| BART + DPR | 86.74 | 47.68 | 55.33 | 74.29 | - | - |
| Multitask DPR + BART | 86.32 | 63.94 | 74.48 | 87.52 | - | - |
| RAG | 86.31 | 53.45 | 61.94 | 75.55 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
| KGI | 85.58 | 64.41 | 75.6 | 84.95 | - | - |
| BART | 78.93 | 0.0 | 0.0 | 0.0 | - | - |
| T5-base | 76.3 | 0.0 | 0.0 | 0.0 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
| GENRE+roBERTa finetuning | 76.26 | 0.0 | 0.0 | 0.0 | - | - |
| SVM with rbf kernel | 72.34 | 0.0 | 0.0 | 0.0 | - | - |
| ElefPav | 71.58 | 0.0 | 0.0 | 0.0 | - | - |
| Alessandro_Tansel | 71.42 | 0.0 | 0.0 | 0.0 | - | - |
| JuanTran | 71.38 | 0.0 | 0.0 | 0.0 | - | - |
| Logistic Regression | 71.24 | 0.0 | 0.0 | 0.0 | - | - |
| QDA | 71.12 | 0.0 | 0.0 | 0.0 | - | - |
| SVM | 70.71 | 0.0 | 0.0 | 0.0 | - | - |
| stupidTeam | 69.71 | 0.0 | 0.0 | 0.0 | - | - |