The Isabelle ENIGMA

Goertzel, Zarathustra A.; Jakubův, Jan; Kaliszyk, Cezary; Olšák, Miroslav; Piepenbrock, Jelle; Urban, Josef

doi:10.4230/LIPIcs.ITP.2022.16

Abstract

We significantly improve the performance of the E automated theorem prover on the Isabelle Sledgehammer problems by combining learning and theorem proving in several ways. In particular, we develop targeted versions of the ENIGMA guidance for the Isabelle problems, targeted versions of neural premise selection, and targeted strategies for E. The methods are trained in several iterations over hundreds of thousands untyped and typed first-order problems extracted from Isabelle. Our final best single-strategy ENIGMA and premise selection system improves the best previous version of E by 25.3% in 15 seconds, outperforming also all other previous ATP and SMT systems.

Takuya Akiba, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, and Masanori Koyama. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2019.
Jesse Alama, Tom Heskes, Daniel Kühlwein, Evgeni Tsivtsivadze, and Josef Urban. Premise selection for mathematics by corpus analysis and kernel methods. J. Autom. Reasoning, 52(2):191-213, 2014. URL: https://doi.org/10.1007/s10817-013-9286-5.
Alexander A. Alemi, François Chollet, Niklas Eén, Geoffrey Irving, Christian Szegedy, and Josef Urban. DeepMath - deep sequence models for premise selection. In Daniel D. Lee, Masashi Sugiyama, Ulrike V. Luxburg, Isabelle Guyon, and Roman Garnett, editors, Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain, pages 2235-2243, 2016. URL: http://papers.nips.cc/paper/6280-deepmath-deep-sequence-models-for-premise-selection.
Lasse Blaauwbroek, Josef Urban, and Herman Geuvers. The tactician - A seamless, interactive tactic learner and prover for coq. In CICM, volume 12236 of Lecture Notes in Computer Science, pages 271-277. Springer, 2020.
Jasmin Christian Blanchette, Sascha Böhme, Andrei Popescu, and Nicholas Smallbone. Encoding monomorphic and polymorphic types. In Nir Piterman and Scott A. Smolka, editors, TACAS, volume 7795 of LNCS, pages 493-507. Springer, 2013. URL: https://doi.org/10.1007/978-3-642-36742-7_34.
Jasmin Christian Blanchette, Max W. Haslbeck, Daniel Matichuk, and Tobias Nipkow. Mining the archive of formal proofs. In Manfred Kerber, Jacques Carette, Cezary Kaliszyk, Florian Rabe, and Volker Sorge, editors, Intelligent Computer Mathematics - International Conference, CICM 2015, Washington, DC, USA, July 13-17, 2015, Proceedings, volume 9150 of Lecture Notes in Computer Science, pages 3-17. Springer, 2015. URL: https://doi.org/10.1007/978-3-319-20615-8_1.
Jasmin Christian Blanchette, Cezary Kaliszyk, Lawrence C. Paulson, and Josef Urban. Hammering towards QED. J. Formalized Reasoning, 9(1):101-148, 2016. URL: https://doi.org/10.6092/issn.1972-5787/4593.
Sascha Böhme and Tobias Nipkow. Sledgehammer: Judgement Day. In Jürgen Giesl and Reiner Hähnle, editors, IJCAR, volume 6173 of LNCS, pages 107-121. Springer, 2010. URL: https://doi.org/10.1007/978-3-642-14203-1_9.
Chad E. Brown, Thibault Gauthier, Cezary Kaliszyk, Geoff Sutcliffe, and Josef Urban. GRUNGE: A grand unified ATP challenge. In CADE, volume 11716 of Lecture Notes in Computer Science, pages 123-141. Springer, 2019.
Karel Chvalovský, Jan Jakubův, Martin Suda, and Josef Urban. ENIGMA-NG: efficient neural and gradient-boosted inference guidance for E. In Pascal Fontaine, editor, Automated Deduction - CADE 27 - 27th International Conference on Automated Deduction, Natal, Brazil, August 27-30, 2019, Proceedings, volume 11716 of Lecture Notes in Computer Science, pages 197-215. Springer, 2019. URL: https://doi.org/10.1007/978-3-030-29436-6_12.
Martin Desharnais, Petar Vukmirović, Jasmin Blanchette, and Makarius Wenzel. Seventeen provers under the hammer, 2022. URL: https://matryoshka-project.github.io/pubs/seventeen.pdf.
Michael Färber and Cezary Kaliszyk. Random forests for premise selection. In Carsten Lutz and Silvio Ranise, editors, Frontiers of Combining Systems - 10th International Symposium, FroCoS 2015, Wroclaw, Poland, September 21-24, 2015. Proceedings, volume 9322 of Lecture Notes in Computer Science, pages 325-340. Springer, 2015. URL: https://doi.org/10.1007/978-3-319-24246-0_20.
Thibault Gauthier, Cezary Kaliszyk, Josef Urban, Ramana Kumar, and Michael Norrish. Tactictoe: Learning to prove with tactics. J. Autom. Reason., 65(2):257-286, 2021.
Zarathustra Goertzel, Jan Jakubův, and Josef Urban. Enigmawatch: Proofwatch meets ENIGMA. In Serenella Cerrito and Andrei Popescu, editors, Automated Reasoning with Analytic Tableaux and Related Methods - 28th International Conference, TABLEAUX 2019, London, UK, September 3-5, 2019, Proceedings, volume 11714 of Lecture Notes in Computer Science, pages 374-388. Springer, 2019. URL: https://doi.org/10.1007/978-3-030-29026-9_21.
Zarathustra Amadeus Goertzel. Make E smart again (short paper). In IJCAR (2), volume 12167 of Lecture Notes in Computer Science, pages 408-415. Springer, 2020.
Zarathustra Amadeus Goertzel, Karel Chvalovský, Jan Jakubuv, Miroslav Olsák, and Josef Urban. Fast and slow enigmas and parental guidance. In Boris Konev and Giles Reger, editors, Frontiers of Combining Systems - 13th International Symposium, FroCoS 2021, Birmingham, UK, September 8-10, 2021, Proceedings, volume 12941 of Lecture Notes in Computer Science, pages 173-191. Springer, 2021. URL: https://doi.org/10.1007/978-3-030-86205-3_10.
John Harrison. HOL Light: A tutorial introduction. In Mandayam K. Srivas and Albert John Camilleri, editors, FMCAD, volume 1166 of LNCS, pages 265-269. Springer, 1996. URL: https://doi.org/10.1007/BFb0031814.
John Harrison. Optimizing Proof Search in Model Elimination. In M. McRobbie and J.K. Slaney, editors, Proceedings of the 13th International Conference on Automated Deduction, number 1104 in LNAI, pages 313-327. Springer, 1996.
Krystof Hoder and Andrei Voronkov. Sine qua non for large theory reasoning. In Nikolaj Bjørner and Viorica Sofronie-Stokkermans, editors, CADE, volume 6803 of LNCS, pages 299-314. Springer, 2011. URL: https://doi.org/10.1007/978-3-642-22438-6_23.
Edvard K. Holden and Konstantin Korovin. Heterogeneous heuristic optimisation and scheduling for first-order theorem proving. In CICM, volume 12833 of Lecture Notes in Computer Science, pages 107-123. Springer, 2021.
R. J. M. Hughes. Super-combinators a new implementation method for applicative languages. In Proceedings of the 1982 ACM Symposium on LISP and Functional Programming, LFP '82, pages 1-10, New York, NY, USA, 1982. Association for Computing Machinery. URL: https://doi.org/10.1145/800068.802129.
Jan Jakubův, Karel Chvalovský, Miroslav Olsák, Bartosz Piotrowski, Martin Suda, and Josef Urban. ENIGMA anonymous: Symbol-independent inference guiding machine (system description). In Nicolas Peltier and Viorica Sofronie-Stokkermans, editors, Automated Reasoning - 10th International Joint Conference, IJCAR 2020, Paris, France, July 1-4, 2020, Proceedings, Part II, volume 12167 of Lecture Notes in Computer Science, pages 448-463. Springer, 2020. URL: https://doi.org/10.1007/978-3-030-51054-1_29.
Jan Jakubův and Josef Urban. BliStrTune: hierarchical invention of theorem proving strategies. In Yves Bertot and Viktor Vafeiadis, editors, Proceedings of the 6th ACM SIGPLAN Conference on Certified Programs and Proofs, CPP 2017, Paris, France, January 16-17, 2017, pages 43-52. ACM, 2017. URL: https://doi.org/10.1145/3018610.3018619.
Jan Jakubův and Josef Urban. ENIGMA: efficient learning-based inference guiding machine. In Herman Geuvers, Matthew England, Osman Hasan, Florian Rabe, and Olaf Teschke, editors, Intelligent Computer Mathematics - 10th International Conference, CICM 2017, Edinburgh, UK, July 17-21, 2017, Proceedings, volume 10383 of Lecture Notes in Computer Science, pages 292-302. Springer, 2017. URL: https://doi.org/10.1007/978-3-319-62075-6_20.
Jan Jakubův and Josef Urban. Enhancing ENIGMA given clause guidance. In Florian Rabe, William M. Farmer, Grant O. Passmore, and Abdou Youssef, editors, Intelligent Computer Mathematics - 11th International Conference, CICM 2018, Hagenberg, Austria, August 13-17, 2018, Proceedings, volume 11006 of Lecture Notes in Computer Science, pages 118-124. Springer, 2018. URL: https://doi.org/10.1007/978-3-319-96812-4_11.
Jan Jakubův and Josef Urban. Hammering Mizar by learning clause guidance. In John Harrison, John O'Leary, and Andrew Tolmach, editors, 10th International Conference on Interactive Theorem Proving, ITP 2019, September 9-12, 2019, Portland, OR, USA, volume 141 of LIPIcs, pages 34:1-34:8. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019. URL: https://doi.org/10.4230/LIPIcs.ITP.2019.34.
Cezary Kaliszyk and Josef Urban. FEMaLeCoP: Fairly efficient machine learning connection prover. In Martin Davis, Ansgar Fehnker, Annabelle McIver, and Andrei Voronkov, editors, Logic for Programming, Artificial Intelligence, and Reasoning - 20th International Conference, LPAR-20 2015, Suva, Fiji, November 24-28, 2015, Proceedings, volume 9450 of Lecture Notes in Computer Science, pages 88-96. Springer, 2015. URL: https://doi.org/10.1007/978-3-662-48899-7_7.
Cezary Kaliszyk and Josef Urban. MizAR 40 for Mizar 40. J. Autom. Reasoning, 55(3):245-256, 2015. URL: https://doi.org/10.1007/s10817-015-9330-8.
Cezary Kaliszyk, Josef Urban, Henryk Michalewski, and Miroslav Olsák. Reinforcement learning of theorem proving. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montréal, Canada., pages 8836-8847, 2018. URL: http://papers.nips.cc/paper/8098-reinforcement-learning-of-theorem-proving.
Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. Lightgbm: A highly efficient gradient boosting decision tree. In NIPS, pages 3146-3154, 2017.
Konstantin Korovin. iprover - an instantiation-based theorem prover for first-order logic (system description). In Alessandro Armando, Peter Baumgartner, and Gilles Dowek, editors, Automated Reasoning, 4th International Joint Conference, IJCAR 2008, Sydney, Australia, August 12-15, 2008, Proceedings, volume 5195 of Lecture Notes in Computer Science, pages 292-298. Springer, 2008. URL: https://doi.org/10.1007/978-3-540-71070-7_24.
Laura Kovács and Andrei Voronkov. First-order theorem proving and Vampire. In Natasha Sharygina and Helmut Veith, editors, CAV, volume 8044 of LNCS, pages 1-35. Springer, 2013. URL: https://doi.org/10.1007/978-3-642-39799-8_1.
Daniel Kühlwein, Twan van Laarhoven, Evgeni Tsivtsivadze, Josef Urban, and Tom Heskes. Overview and evaluation of premise selection techniques for large theory mathematics. In Bernhard Gramlich, Dale Miller, and Uli Sattler, editors, IJCAR, volume 7364 of LNCS, pages 378-392. Springer, 2012. URL: https://doi.org/10.1007/978-3-642-31365-3_30.
Julio César López-Hernández and Konstantin Korovin. An abstraction-refinement framework for reasoning with large theories. In IJCAR, volume 10900 of Lecture Notes in Computer Science, pages 663-679. Springer, 2018.
Jia Meng and Lawrence C. Paulson. Lightweight relevance filtering for machine-generated resolution problems. J. Applied Logic, 7(1):41-57, 2009. URL: https://doi.org/10.1016/j.jal.2007.07.004.
Yutaka Nagashima and Ramana Kumar. A proof strategy language and proof script generation for isabelle/hol. In CADE, volume 10395 of Lecture Notes in Computer Science, pages 528-545. Springer, 2017.
Miroslav Olsák, Cezary Kaliszyk, and Josef Urban. Property invariant embedding for automated reasoning. In Giuseppe De Giacomo, Alejandro Catalá, Bistra Dilkina, Michela Milano, Senén Barro, Alberto Bugarín, and Jérôme Lang, editors, ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August-8 September 2020, Santiago de Compostela, Spain, August 29 - September 8, 2020 - Including 10th Conference on Prestigious Applications of Artificial Intelligence (PAIS 2020), volume 325 of Frontiers in Artificial Intelligence and Applications, pages 1395-1402. IOS Press, 2020. URL: https://doi.org/10.3233/FAIA200244.
Ross A. Overbeek. A new class of automated theorem-proving algorithms. J. ACM, 21(2):191-200, April 1974. URL: https://doi.org/10.1145/321812.321814.
Bartosz Piotrowski and Josef Urban. ATPboost: Learning premise selection in binary setting with ATP feedback. In Didier Galmiche, Stephan Schulz, and Roberto Sebastiani, editors, Automated Reasoning - 9th International Joint Conference, IJCAR 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 14-17, 2018, Proceedings, volume 10900 of Lecture Notes in Computer Science, pages 566-574. Springer, 2018. URL: https://doi.org/10.1007/978-3-319-94205-6_37.
Bartosz Piotrowski and Josef Urban. Stateful premise selection by recurrent neural networks. In LPAR, volume 73 of EPiC Series in Computing, pages 409-422. EasyChair, 2020.
Michael Rawson and Giles Reger. lazycop: Lazy paramodulation meets neurally guided search. In TABLEAUX, volume 12842 of Lecture Notes in Computer Science, pages 187-199. Springer, 2021.
Simon Schäfer and Stephan Schulz. Breeding theorem proving heuristics with genetic algorithms. In Georg Gottlob, Geoff Sutcliffe, and Andrei Voronkov, editors, Global Conference on Artificial Intelligence, GCAI 2015, Tbilisi, Georgia, October 16-19, 2015, volume 36 of EPiC Series in Computing, pages 263-274. EasyChair, 2015. URL: http://www.easychair.org/publications/paper/Breeding_Theorem_Proving_Heuristics_with_Genetic_Algorithms, URL: https://doi.org/10.29007/gms9.
Stephan Schulz. System description: E 1.8. In Kenneth L. McMillan, Aart Middeldorp, and Andrei Voronkov, editors, LPAR, volume 8312 of LNCS, pages 735-743. Springer, 2013. URL: https://doi.org/10.1007/978-3-642-45221-5_49.
Stephan Schulz, Simon Cruanes, and Petar Vukmirovic. Faster, higher, stronger: E 2.3. In Pascal Fontaine, editor, Automated Deduction - CADE 27 - 27th International Conference on Automated Deduction, Natal, Brazil, August 27-30, 2019, Proceedings, volume 11716 of Lecture Notes in Computer Science, pages 495-507. Springer, 2019. URL: https://doi.org/10.1007/978-3-030-29436-6_29.
René Thiemann and Christian Sternagel. Certification of termination proofs using ceta. In Stefan Berghofer, Tobias Nipkow, Christian Urban, and Makarius Wenzel, editors, TPHOLs, volume 5674 of Lecture Notes in Computer Science, pages 452-468. Springer, 2009. URL: https://doi.org/10.1007/978-3-642-03359-9.
Josef Urban. MPTP - Motivation, Implementation, First Experiments. J. Autom. Reasoning, 33(3-4):319-339, 2004. URL: https://doi.org/10.1007/s10817-004-6245-1.
Josef Urban. BliStr: The Blind Strategymaker. In Georg Gottlob, Geoff Sutcliffe, and Andrei Voronkov, editors, Global Conference on Artificial Intelligence, GCAI 2015, Tbilisi, Georgia, October 16-19, 2015, volume 36 of EPiC Series in Computing, pages 312-319. EasyChair, 2015. URL: http://www.easychair.org/publications/paper/BliStr_The_Blind_Strategymaker, URL: https://doi.org/10.29007/8n7m.
Josef Urban. ERC project AI4Reason final scientific report, 2021. URL: http://grid01.ciirc.cvut.cz/~mptp/ai4reason/PR_CORE_SCIENTIFIC_4.pdf.
Robert Veroff. Using hints to increase the effectiveness of an automated reasoning program: Case studies. J. Autom. Reasoning, 16(3):223-239, 1996. URL: https://doi.org/10.1007/BF00252178.
Zsolt Zombori, Josef Urban, and Miroslav Olsák. The role of entropy in guiding a connection prover. In TABLEAUX, volume 12842 of Lecture Notes in Computer Science, pages 218-235. Springer, 2021.

The Isabelle ENIGMA

Authors Zarathustra A. Goertzel , Jan Jakubův , Cezary Kaliszyk , Miroslav Olšák , Jelle Piepenbrock , Josef Urban

File

Document Identifiers

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

References

Thanks for your feedback!

Could not send message

The Isabelle ENIGMA

Authors Zarathustra A. Goertzel , Jan Jakubův , Cezary Kaliszyk , Miroslav Olšák , Jelle Piepenbrock , Josef Urban

File

Document Identifiers

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

Funding

Supplementary Materials

References

Thanks for your feedback!

Could not send message