Question Answering on CoQA
Evaluation metric
Overall
Results
Performance of each model on this benchmark
| Model | Overall | Paper Title | Repository |
|---|---|---|---|
| GPT-3 175B (few-shot, k=32) | 85 | Language Models are Few-Shot Learners | |
| BERT Large Augmented (single model) | 81.1 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | |
| SDNet (ensemble) | 79.3 | SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering | |
| BERT-base finetune (single model) | 78.1 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | |
| SDNet (single model) | 76.6 | SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering | |
| FlowQA (single model) | 75.0 | FlowQA: Grasping Flow in History for Conversational Machine Comprehension | |
| BiDAF++ (single model) | 67.8 | A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC | |
| DrQA + seq2seq with copy attention (single model) | 65.1 | CoQA: A Conversational Question Answering Challenge | |
| Vanilla DrQA (single model) | 52.6 | CoQA: A Conversational Question Answering Challenge | |