Slot Filling on KILT: T-REx
Evaluation Metrics
Accuracy
F1
KILT-AC
KILT-F1
R-Prec
Recall@5
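As a rough illustration of how the retrieval-side metrics relate to the answer-side ones, the sketch below computes R-Prec, Recall@5, and a KILT-style accuracy for a single example. It is a simplified sketch under stated assumptions, not the official KILT scorer: it assumes one gold provenance set per example and a plain case-insensitive answer match, and the function names `r_precision`, `recall_at_k`, and `kilt_accuracy` are hypothetical.

```python
def r_precision(retrieved, gold):
    """Share of gold pages found in the top-R retrieved pages, where R = len(gold)."""
    gold = set(gold)
    if not gold:
        return 0.0
    top_r = set(retrieved[:len(gold)])
    return len(gold & top_r) / len(gold)


def recall_at_k(retrieved, gold, k=5):
    """Share of gold pages found in the top-k retrieved pages."""
    gold = set(gold)
    if not gold:
        return 0.0
    return len(gold & set(retrieved[:k])) / len(gold)


def kilt_accuracy(prediction, answer, retrieved, gold):
    """Credit a correct answer only when R-Prec is 1.0 (assumed simple string match)."""
    correct = prediction.strip().lower() == answer.strip().lower()
    return 1.0 if correct and r_precision(retrieved, gold) == 1.0 else 0.0
```

Under this reading, a system that returns no provenance scores 0.0 on KILT-AC and KILT-F1 whatever its Accuracy and F1, and a retrieval-only system scores on R-Prec and Recall@5 alone.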
Evaluation Results
Results of each model on this benchmark
| Model | Accuracy | F1 | KILT-AC | KILT-F1 | R-Prec | Recall@5 | Paper Title | Repository |
|---|---|---|---|---|---|---|---|---|
| Re2G | 87.68 | 89.93 | 75.84 | 77.05 | 80.7 | 89.0 | Re2G: Retrieve, Rerank, Generate | |
| KGI_1 | 84.36 | 87.24 | 69.14 | 70.58 | 74.36 | 83.14 | - | - |
| single ngram | 83.72 | 86.53 | 60.08 | 61.72 | 67.8 | 81.52 | - | - |
| Wikipedia | 81.34 | 84.46 | 64.64 | 66.64 | 75.64 | 87.57 | - | - |
| MetaRAG | 78.66 | 81.71 | 61.88 | 63.09 | 66.36 | 76.24 | - | - |
| KGI_0 (reupload) | 77.9 | 81.31 | 55.54 | 56.79 | 59.7 | 70.38 | - | - |
| RAG | 59.2 | 62.96 | 23.12 | 23.94 | 28.68 | 33.04 | - | - |
| BART + DPR | 59.16 | 62.76 | 11.12 | 11.41 | 13.26 | 17.04 | - | - |
| Sphere | 57.02 | 61.46 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
| 10k | 53.9 | 61.74 | 27.84 | 32.34 | 37.62 | 40.07 | - | - |
| DensePhrases | 53.9 | 61.74 | 27.84 | 32.34 | 37.62 | 40.07 | Learning Dense Representations of Phrases at Scale | |
| Coop. DistilBert | 49.04 | 54.62 | 36.68 | 39.57 | 48.08 | 51.86 | - | - |
| BART | 45.06 | 49.24 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
| T5-base | 43.56 | 50.61 | 0.0 | 0.0 | 0.0 | 0.0 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
| multi-task small | 19.3 | 25.81 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
| GENRE | 0.1 | 7.67 | 0.04 | 6.66 | 79.42 | 85.33 | - | - |
| JivBest | 0.02 | 2.04 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
| Multi-task DPR | 0.0 | 0.0 | 0.0 | 0.0 | 69.46 | 83.88 | - | - |
| TABi | 0.0 | 0.0 | 0.0 | 0.0 | 81.9 | 89.36 | - | - |
| chriskuei | 0.0 | 0.0 | 0.0 | 0.0 | 79.98 | 85.75 | - | - |