Automated Theorem Proving On Minif2F Valid
评估指标
Pass@64
评测结果
各个模型在此基准测试上的表现结果
| Paper Title | Repository | ||
|---|---|---|---|
| Evariste | 58.6 | HyperTree Proof Search for Neural Theorem Proving | - |
| Evariste-7d | 47.5 | HyperTree Proof Search for Neural Theorem Proving | - |
| GPT-f | 47.3 | HyperTree Proof Search for Neural Theorem Proving | - |
| Evariste-1d | 46.7 | HyperTree Proof Search for Neural Theorem Proving | - |
| Metamath GPT-f | - | MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics | |
| Lean GPT-f | - | MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics | |
| Lean tidy | - | MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics | |
| DSP (62B Minerva informal) | - | Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs | |
| Lyra + GPT-4 | - | Lyra: Orchestrating Dual Correction in Automated Theorem Proving | |
| LEGO-Prover ChatGPT | - | LEGO-Prover: Neural Theorem Proving with Growing Libraries |
0 of 10 row(s) selected.