HyperAIHyperAI

Command Palette

Search for a command to run...

3 months ago

HyperTree Proof Search for Neural Theorem Proving

Guillaume Lample Marie-Anne Lachaux Thibaut Lavril Xavier Martinet Amaury Hayat Gabriel Ebner Aurélien Rodriguez Timothée Lacroix

HyperTree Proof Search for Neural Theorem Proving

Abstract

We propose an online training procedure for a transformer-based automated theorem prover. Our approach leverages a new search algorithm, HyperTree Proof Search (HTPS), inspired by the recent success of AlphaZero. Our model learns from previous proof searches through online training, allowing it to generalize to domains far from the training distribution. We report detailed ablations of our pipeline's main components by studying performance on three environments of increasing complexity. In particular, we show that with HTPS alone, a model trained on annotated proofs manages to prove 65.4% of a held-out set of Metamath theorems, significantly outperforming the previous state of the art of 56.5% by GPT-f. Online training on these unproved theorems increases accuracy to 82.6%. With a similar computational budget, we improve the state of the art on the Lean-based miniF2F-curriculum dataset from 31% to 42% proving accuracy.

Benchmarks

BenchmarkMethodologyMetrics
automated-theorem-proving-on-metamath-setmmEvariste
Pass@32: 72.4
automated-theorem-proving-on-minif2f-1Evariste
Pass@64: 32.1
automated-theorem-proving-on-minif2f-1Evariste-1d
Pass@64: 33.6
automated-theorem-proving-on-minif2f-1GPT-f
Pass@64: 30.6
automated-theorem-proving-on-minif2f-1Evariste-7d
Pass@64: 42.5
automated-theorem-proving-on-minif2f-testEvariste
ITP: Lean
Pass@64: 41
cumulative: 41
automated-theorem-proving-on-minif2f-testGPT-f
ITP: Metamath
Pass@64: 36.6
cumulative: 36.6
automated-theorem-proving-on-minif2f-testEvariste-1d
ITP: Lean
Pass@64: 38.9
cumulative: 38.9
automated-theorem-proving-on-minif2f-testEvariste-7d
ITP: Lean
Pass@64: 40.6
cumulative: 40.6
automated-theorem-proving-on-minif2f-validGPT-f
Pass@64: 47.3
automated-theorem-proving-on-minif2f-validEvariste-1d
Pass@64: 46.7
automated-theorem-proving-on-minif2f-validEvariste-7d
Pass@64: 47.5
automated-theorem-proving-on-minif2f-validEvariste
Pass@64: 58.6

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp
HyperTree Proof Search for Neural Theorem Proving | Papers | HyperAI