Open Domain Question Answering on KILT 1
Evaluation Metrics
EM
F1
KILT-EM
KILT-F1
R-Prec
Recall@5
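
As a rough illustration of the metrics listed above, the sketch below shows how they are commonly computed. It assumes SQuAD-style answer normalization for EM and F1, R-Prec taken over the top R retrieved pages (R being the number of gold provenance pages), and the KILT variants awarding the answer score only when R-Prec equals 1. All function names here are hypothetical and are not taken from the official KILT evaluation scripts. Recall@5 is omitted because it depends on how provenance sets are grouped.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """EM: 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def token_f1(prediction: str, gold: str) -> float:
    """F1: harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def r_precision(ranked_pages: list, gold_pages: set) -> float:
    """R-Prec: share of the top-R retrieved pages that are gold, R = len(gold_pages)."""
    r = len(gold_pages)
    if r == 0:
        return 0.0
    return sum(1 for page in ranked_pages[:r] if page in gold_pages) / r


def kilt_score(answer_score: float, rprec: float) -> float:
    """KILT-EM / KILT-F1 (assumed gating): keep the score only if R-Prec is 1."""
    return answer_score if rprec == 1.0 else 0.0
```

Under this gating, a system that answers correctly but retrieves the wrong provenance scores 0 on KILT-EM and KILT-F1, so those two columns can never exceed the plain EM and F1 columns.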
Evaluation Results
Performance of each model on this benchmark
| Model | EM | F1 | KILT-EM | KILT-F1 | R-Prec | Recall@5 | Paper Title | Repository |
|---|---|---|---|---|---|---|---|---|
| intersect | 40.46 | 51.44 | 18.06 | 21.42 | 58.83 | 51.03 | - | - |
| Wikipedia | 36.9 | 47.66 | 11.71 | 13.88 | 45.38 | 35.75 | - | - |
| Multitask DPR + BART | 31.77 | 41.56 | 9.53 | 11.27 | 42.92 | 28.39 | - | - |
| Sphere | 31.64 | 41.55 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
| RAG | 26.97 | 36.03 | 3.21 | 4.1 | 30.59 | 12.59 | - | - |
| BART + DPR | 25.18 | 34.07 | 1.96 | 2.53 | 25.04 | 10.4 | - | - |
| BART | 15.37 | 21.97 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
| T5-base | 12.64 | 19.57 | 0.0 | 0.0 | 0.0 | 0.0 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
| BERT + DPR | 11.29 | 17.35 | 0.74 | 1.26 | 25.04 | 10.4 | - | - |
| multi-task small | 3.29 | 6.84 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
| Multi-task DPR | 0.0 | 0.0 | 0.0 | 0.0 | 42.92 | 28.39 | - | - |
| GENRE | 0.0 | 0.0 | 0.0 | 0.0 | 51.27 | 34.03 | - | - |
| TABi | 0.0 | 0.0 | 0.0 | 0.0 | 53.12 | 35.48 | - | - |
| chriskuei | 0.0 | 0.0 | 0.0 | 0.0 | 51.8 | 34.57 | - | - |