Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Zhenyu Pan Haozheng Luo Manling Li Han Liu

Abstract
We present a Chain-of-Action (CoA) framework for multimodal and retrieval-augmented question answering (QA). Compared with prior work, CoA addresses two major challenges of current QA applications: (i) unfaithful hallucination that is inconsistent with real-time or domain facts, and (ii) weak reasoning performance over compositional information. Our key contribution is a novel reasoning-retrieval mechanism that decomposes a complex question into a reasoning chain via systematic prompting and pre-designed actions. Methodologically, we propose three types of domain-adaptable "plug-and-play" actions for retrieving real-time information from heterogeneous sources. We also propose a multi-reference faith score (MRFS) to verify and resolve conflicts in the answers. Empirically, we use both public benchmarks and a Web3 case study to demonstrate the capability of CoA over other methods.
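The mechanism described in the abstract (a reasoning chain of sub-questions, pluggable retrieval actions, and a faith check that resolves conflicts) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `SubQuestion`, `faith_score`, and `run_chain` are hypothetical, the actions are plain callables returning canned text, the threshold of 0.5 is arbitrary, and the token-overlap score is only a stand-in for MRFS, whose actual definition is not given in this abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical: an action maps a sub-question to retrieved reference texts.
Action = Callable[[str], List[str]]


@dataclass
class SubQuestion:
    text: str    # one node of the reasoning chain
    guess: str   # initial answer (an LLM would produce this; supplied by hand here)
    action: str  # name of the action used to verify the guess


def faith_score(answer: str, references: List[str]) -> float:
    """Assumed stand-in for MRFS: the best fraction of answer tokens
    found in any single reference. Not the paper's formula."""
    tokens = set(answer.lower().split())
    if not tokens or not references:
        return 0.0
    return max(len(tokens & set(r.lower().split())) / len(tokens)
               for r in references)


def run_chain(chain: List[SubQuestion],
              actions: Dict[str, Action],
              threshold: float = 0.5) -> List[str]:
    """Verify each guess against retrieved references; if the score is
    below the threshold, fall back to the first retrieved reference."""
    answers = []
    for node in chain:
        refs = actions[node.action](node.text)
        if not refs or faith_score(node.guess, refs) >= threshold:
            answers.append(node.guess)   # nothing retrieved, or guess is supported
        else:
            answers.append(refs[0])      # conflict: prefer the retrieved text
    return answers


# Toy actions with canned results (hypothetical; no real retrieval happens).
actions = {
    "web": lambda q: ["Paris is the capital of France"],
    "kb": lambda q: ["ETH is the native token of Ethereum"],
}
chain = [
    SubQuestion("What is the capital of France?", "Paris", "web"),
    SubQuestion("What is the native token of Ethereum?", "Bitcoin", "kb"),
]
result = run_chain(chain, actions)
```

In this toy run the first guess is supported by its reference and is kept, while the second conflicts with its reference and is replaced by the retrieved text. A real system would swap the canned lambdas for actual retrieval and the overlap score for the paper's MRFS.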
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| question-answering-on-fever | CoA w/o actions | EM: 54.2 |
| question-answering-on-fever | Self-Ask | EM: 64.2 |
| question-answering-on-fever | Zero-shot | EM: 50 |
| question-answering-on-fever | CoA | EM: 68.9 |
| question-answering-on-fever | DSP | EM: 62.2 |
| question-answering-on-strategyqa | SearchChain | EM: 77 |
| question-answering-on-strategyqa | CoA w/o actions | EM: 70.6 |
| question-answering-on-strategyqa | CoA | EM: 79.2 |
| question-answering-on-strategyqa | Least-to-Most | EM: 65.8 |
| question-answering-on-truthfulqa | CoA | EM: 67.3 |
| question-answering-on-truthfulqa | CoA w/o actions | EM: 63.3 |
| question-answering-on-webquestions | ToT | EM: 26.3 |
| question-answering-on-webquestions | DSP | EM: 59.4 |
| question-answering-on-webquestions | CoT | EM: 42.5 |
| question-answering-on-webquestions | Self-Ask | EM: 31.1 |
| question-answering-on-webquestions | CoA | EM: 70.7 |
| question-answering-on-webquestions | ReAct | EM: 38.3 |
| question-answering-on-webquestions | Zero-shot | EM: 43 |
| question-answering-on-webquestions | CoA w/o actions | EM: 64.7 |
| question-answering-on-webquestions | Few-shot | EM: 44.7 |
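
The table reports both CoA and a "CoA w/o actions" ablation on all four benchmarks, so the contribution of the actions can be read off as a difference in exact match (EM). The short Python sketch below does that arithmetic; the scores are copied from the table, while the variable names and the shortened benchmark labels are illustrative choices.

```python
# EM scores copied from the table above: (CoA, CoA w/o actions).
scores = {
    "fever": (68.9, 54.2),
    "strategyqa": (79.2, 70.6),
    "truthfulqa": (67.3, 63.3),
    "webquestions": (70.7, 64.7),
}

# EM gain attributable to the actions, in points, rounded to one decimal.
gains = {name: round(full - ablated, 1)
         for name, (full, ablated) in scores.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} EM")
# fever: +14.7 EM
# strategyqa: +8.6 EM
# truthfulqa: +4.0 EM
# webquestions: +6.0 EM
```

On these numbers the actions help most on FEVER and least on TruthfulQA.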