
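Abstract
We present Chain-of-Action (CoA), a framework for multimodal and retrieval-augmented question answering (QA). Compared with prior work, CoA addresses two major challenges in current QA applications: (i) unfaithful hallucination, where answers are inconsistent with real-time or domain facts; and (ii) weak reasoning over compositional information. Our key contribution is a novel reasoning-retrieval mechanism that decomposes a complex question into a chain of executable reasoning steps through systematic prompting and pre-designed actions. Methodologically, we design three domain-adaptable "Plug-and-Play" actions for retrieving real-time information from heterogeneous sources. We also introduce a Multi-Reference Faith Score (MRFS) to verify answers and resolve conflicts among them. Empirically, we use public benchmark datasets and a Web3 case study to demonstrate that CoA outperforms existing methods.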
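The decompose, retrieve, and verify loop described in the abstract can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the `Step` structure, the `run_chain` function, the `"web"` action name, and especially `overlap_score`, which is a toy token-overlap stand-in and not the MRFS defined in the paper. It only shows the general shape of keeping a drafted answer when retrieved references support it and replacing it otherwise.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type: an action maps a sub-question to retrieved passages.
Action = Callable[[str], List[str]]


@dataclass
class Step:
    sub_question: str   # one link in the reasoning chain
    action: str         # name of the pluggable action to run (hypothetical)
    draft_answer: str   # answer drafted before any retrieval


def overlap_score(answer: str, references: List[str]) -> float:
    """Toy stand-in for a faith score (NOT the paper's MRFS):
    the best fraction of answer tokens found in any single reference."""
    answer_tokens = set(answer.lower().split())
    if not answer_tokens or not references:
        return 0.0
    return max(
        len(answer_tokens & set(ref.lower().split())) / len(answer_tokens)
        for ref in references
    )


def run_chain(
    steps: List[Step],
    actions: Dict[str, Action],
    threshold: float = 0.5,  # assumed cut-off, chosen arbitrarily
) -> List[str]:
    """Run each step's action and keep or replace its drafted answer."""
    answers = []
    for step in steps:
        references = actions[step.action](step.sub_question)
        if overlap_score(step.draft_answer, references) >= threshold:
            answers.append(step.draft_answer)   # draft is supported
        elif references:
            answers.append(references[0])       # fall back to retrieved text
        else:
            answers.append(step.draft_answer)   # nothing retrieved; keep draft
    return answers
```

For example, with a single toy action `{"web": lambda q: ["Paris is the capital of France"]}`, a step whose draft answer is `"Paris"` is kept, while a draft of `"Lyon"` is replaced by the retrieved passage.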
Code Repository
MAGICS-LAB/Chain-of-Actions
Official
pytorch
Benchmarks
| Benchmark | Method | Metric |
|---|---|---|
| question-answering-on-fever | CoA w/o actions | EM: 54.2 |
| question-answering-on-fever | Self-Ask | EM: 64.2 |
| question-answering-on-fever | Zero-shot | EM: 50 |
| question-answering-on-fever | CoA | EM: 68.9 |
| question-answering-on-fever | DSP | EM: 62.2 |
| question-answering-on-strategyqa | SearchChain | EM: 77 |
| question-answering-on-strategyqa | CoA w/o actions | EM: 70.6 |
| question-answering-on-strategyqa | CoA | EM: 79.2 |
| question-answering-on-strategyqa | Least-to-Most | EM: 65.8 |
| question-answering-on-truthfulqa | CoA | EM: 67.3 |
| question-answering-on-truthfulqa | CoA w/o actions | EM: 63.3 |
| question-answering-on-webquestions | ToT | EM: 26.3 |
| question-answering-on-webquestions | DSP | EM: 59.4 |
| question-answering-on-webquestions | CoT | EM: 42.5 |
| question-answering-on-webquestions | Self-Ask | EM: 31.1 |
| question-answering-on-webquestions | CoA | EM: 70.7 |
| question-answering-on-webquestions | ReAct | EM: 38.3 |
| question-answering-on-webquestions | Zero-shot | EM: 43 |
| question-answering-on-webquestions | CoA w/o actions | EM: 64.7 |
| question-answering-on-webquestions | Few-shot | EM: 44.7 |
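
The table reports EM (exact match) throughout. As a rough illustration of how such a score is commonly computed, the sketch below uses SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace). This page does not state which normalization the listed benchmarks apply, so the `normalize` and `exact_match` helpers are assumptions rather than the evaluation code behind these numbers.

```python
import re
import string
from typing import List


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(predictions: List[str], gold_answers: List[List[str]]) -> float:
    """Percentage of predictions that equal any gold answer after normalization."""
    if not predictions or len(predictions) != len(gold_answers):
        raise ValueError("predictions and gold_answers must be non-empty and aligned")
    hits = sum(
        any(normalize(pred) == normalize(gold) for gold in golds)
        for pred, golds in zip(predictions, gold_answers)
    )
    return 100.0 * hits / len(predictions)
```

Under this definition, `exact_match(["The Eiffel Tower.", "Berlin"], [["Eiffel Tower"], ["Munich"]])` returns `50.0`: the first prediction matches after normalization and the second does not.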