Cumulative Reasoning with Large Language Models

Yifan Zhang, Jingqin Yang, Yang Yuan, Andrew Chi-Chih Yao

Abstract
Recent advancements in large language models (LLMs) have shown remarkable progress, yet their ability to solve complex problems remains limited. In this work, we introduce Cumulative Reasoning (CR), an approach that utilizes LLMs cumulatively and iteratively, mirroring human thought processes for problem-solving. CR decomposes tasks into smaller, manageable components and leverages previous propositions for effective composition, significantly enhancing problem-solving capabilities. We demonstrate CR's advantage through several complex reasoning tasks: it outperforms existing methods in logical inference tasks with up to a 9.3% improvement, achieving 98.04% accuracy on the curated FOLIO wiki dataset. In the Game of 24, it achieves 98% accuracy, marking a 24% improvement over the prior state-of-the-art. In solving MATH problems, CR achieves a 4.2% increase from previous methods and a 43% relative improvement in the most challenging level 5 problems. When incorporating a code environment with CR, we further harness LLMs' reasoning capabilities and outperform the Program of Thought (PoT) method by 38.8%. The code is available at https://github.com/iiis-ai/cumulative-reasoning.
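The abstract describes CR's recipe: decompose the task, propose intermediate propositions, and let verified propositions accumulate as context for later steps. The following is a minimal sketch of that loop in Python, not the paper's implementation: `llm` stands for any text-in/text-out completion call you supply, and the prompts and the `max_steps` cap are illustrative assumptions.

```python
from typing import Callable, List

def cumulative_reasoning(
    llm: Callable[[str], str],   # any text-in/text-out LLM call (assumption)
    premises: List[str],
    question: str,
    max_steps: int = 8,          # illustrative cap, not from the paper
) -> str:
    """Minimal sketch of a Cumulative Reasoning loop.

    A proposer suggests one new proposition from the premises plus all
    previously verified propositions; a verifier accepts or rejects it;
    accepted propositions accumulate and feed every later step.
    """
    propositions: List[str] = []  # verified intermediate results
    for _ in range(max_steps):
        context = "\n".join(premises + propositions)
        # Proposer: derive one new proposition from everything known so far.
        proposal = llm(
            f"Known facts:\n{context}\n"
            f"State one new proposition that follows from these facts "
            f"and helps answer: {question}"
        )
        # Verifier: keep the proposal only if it is entailed by the context.
        verdict = llm(
            f"Known facts:\n{context}\nProposition: {proposal}\n"
            "Does the proposition follow from the facts? Answer yes or no."
        )
        if verdict.strip().lower().startswith("yes"):
            propositions.append(proposal)
            # Stop early once the accumulated facts suffice to answer.
            enough = llm(
                f"Known facts:\n{context}\n{proposal}\n"
                f"Can the question '{question}' now be answered? yes or no."
            )
            if enough.strip().lower().startswith("yes"):
                break
    # Final answer composed from premises plus all verified propositions.
    return llm(
        "Known facts:\n" + "\n".join(premises + propositions) +
        f"\nAnswer the question: {question}"
    )
```

In practice `llm` would wrap a chat-completion call with role-specific system prompts. The design point CR adds over plain chain-of-thought is that verified propositions persist across iterations, so the reasoning forms a growing graph of composable facts rather than a single linear chain.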
Code Repositories
https://github.com/iiis-ai/cumulative-reasoning
Benchmarks
| Benchmark | Methodology | Accuracy (%) |
|---|---|---|
| Math Word Problem Solving on MATH | CR (GPT-4-turbo, w/ code) | 72.2 |
| Math Word Problem Solving on MATH | CR (GPT-4, w/o code) | 58.0 |