Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

Albert Q. Jiang, Sean Welleck, Jin Peng Zhou, Wenda Li, Jiacheng Liu, Mateja Jamnik, Timothée Lacroix, Yuhuai Wu, Guillaume Lample

Abstract

The formalization of existing mathematical proofs is a notoriously difficult process. Despite decades of research on automation and proof assistants, writing formal proofs remains arduous and only accessible to a few experts. While previous studies to automate formalization focused on powerful search algorithms, no attempts were made to take advantage of available informal proofs. In this work, we introduce Draft, Sketch, and Prove (DSP), a method that maps informal proofs to formal proof sketches, and uses the sketches to guide an automated prover by directing its search to easier sub-problems. We investigate two relevant setups where informal proofs are either written by humans or generated by a language model. Our experiments and ablation studies show that large language models are able to produce well-structured formal sketches that follow the same reasoning steps as the informal proofs. Guiding an automated prover with these sketches enhances its performance from 20.9% to 39.3% on a collection of mathematical competition problems.
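
The three stages named in the abstract can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the `Sketch` type, the `draft_sketch_prove` function, and the three callables passed to it are hypothetical names introduced here for illustration, and the toy stubs stand in for a language model and an automated prover.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Sketch:
    """Hypothetical container: a formal proof sketch as a list of open sub-goals."""
    steps: List[str]


def draft_sketch_prove(
    statement: str,
    draft: Callable[[str], str],                # hypothetical: produces an informal proof
    sketch: Callable[[str, str], Sketch],       # hypothetical: maps informal proof to a formal sketch
    prove_gap: Callable[[str], Optional[str]],  # hypothetical: automated prover, None on failure
) -> Optional[List[str]]:
    """Return one proof per sub-goal, or None if any sub-goal stays open."""
    informal = draft(statement)                 # Draft: human-written or model-generated
    formal_sketch = sketch(statement, informal) # Sketch: same reasoning steps, formal syntax
    proofs: List[str] = []
    for goal in formal_sketch.steps:            # Prove: search only over the easier sub-problems
        proof = prove_gap(goal)
        if proof is None:
            return None
        proofs.append(proof)
    return proofs


# Toy stubs so the sketch runs end to end (assumptions, not real components).
def toy_draft(statement: str) -> str:
    return "Split into two cases and handle each."


def toy_sketch(statement: str, informal: str) -> Sketch:
    return Sketch(steps=["case_1", "case_2"])


def toy_prover(goal: str) -> Optional[str]:
    return f"proved:{goal}"


result = draft_sketch_prove("toy_theorem", toy_draft, toy_sketch, toy_prover)
```

In this reading, the whole attempt fails as soon as one sub-goal cannot be closed, which matches the idea that the sketch only helps if every gap it leaves is within reach of the automated prover.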

Code Repositories

rah4927/lean-dojo-mew (mentioned in GitHub)
facebookresearch/minif2f (official; mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| automated-theorem-proving-on-minif2f-test | DSP (540B Minerva informal) | ITP: Isabelle; Pass@100: 38.9; cumulative: 38.9 |
| automated-theorem-proving-on-minif2f-test | Sledgehammer + heuristics | ITP: Isabelle; Pass@1: 20.9; cumulative: 20.9 |
| automated-theorem-proving-on-minif2f-valid | DSP (62B Minerva informal) | Pass@100: 43.9 |
