Question Answering on OpenBookQA

Evaluation Metric

Accuracy

Evaluation Results

Performance of each model on this benchmark.

| Model | Accuracy | Paper Title | Repository |
| --- | --- | --- | --- |
| GPT-4 + knowledge base | 95.9 | - | - |
| MVP-Tuning (ensemble) | 95.2 | - | - |
| PaLM 540B (Self Improvement, Self Consistency) | 94.4 | Large Language Models Can Self-Improve | - |
| X-Reasoner | 94.2 | - | - |
| PaLM 540B (Self Improvement, CoT Prompting) | 93 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Self Improvement, Standard-Prompting) | 92 | Large Language Models Can Self-Improve | - |
| DeBERTa-xxlarge 1.5B + MVP-Tuning | 91.3 | - | - |
| GrapeQA: PEGA+CANP | 90 | GrapeQA: GRaph Augmentation and Pruning to Enhance Question-Answering | - |
| PaLM 540B (Self Consistency) | 90 | Large Language Models Can Self-Improve | - |
| GenMC 11B | 89.8 | Clues Before Answers: Generation-Enhanced Multiple-Choice QA | |
| AristoRoBERTa + MVP-Tuning | 87.6 | - | - |
| AristoRoBERTa + Graph Soft Counter | 87.4 | GNN is a Counter? Revisiting GNN for Question Answering | - |
| UnifiedQA 11B | 87.2 | UnifiedQA: Crossing Format Boundaries With a Single QA System | |
| LLaMA-3 8B+MoSLoRA | 86.8 | Mixture-of-Subspaces in Low-Rank Adaptation | |
| PaLM 540B (CoT Prompting) | 86.4 | Large Language Models Can Self-Improve | - |
| LLaMA-3 8B + MixLoRA | 84.8 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | |
| PaLM 540B (Standard-Prompting) | 84.4 | Large Language Models Can Self-Improve | - |
| TTTTT 3B | 83.2 | Fusing Context Into Knowledge Graph for Commonsense Question Answering | |
| LLaMA-2 13B + MixLoRA | 83 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | |
| QA-GNN | 82.8 | QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering | |