Interactive Evaluation of Dialog on DSTC9
Evaluation Metrics
Coherent
Consistent
Diversity
Error Recovery
Flexible
Informative
Inquisitive
Likeable
Overall Human Rating
Topic Depth
Understanding
Evaluation Results
Performance of each model on this benchmark
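| Model | Coherent | Consistent | Diversity | Error Recovery | Flexible | Informative | Inquisitive | Likeable | Overall Human Rating | Topic Depth | Understanding | Paper Title | Repository |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PLATO-2 | 2.8017 | 0.9390 | 2.7441 | 2.7518 | 2.8000 | 2.7881 | 2.7949 | 2.7878 | 4.15 | 2.7678 | 2.8285 | A Unified Pre-training Framework for Conversational AI | |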