MizAR 60 for Mizar 50

Jakubův, Jan; Chvalovský, Karel; Goertzel, Zarathustra; Kaliszyk, Cezary; Olšák, Mirek; Piotrowski, Bartosz; Schulz, Stephan; Suda, Martin; Urban, Josef

doi:10.4230/LIPIcs.ITP.2023.19

Abstract

As a present to Mizar on its 50th anniversary, we develop an AI/TP system that automatically proves about 60% of the Mizar theorems in the hammer setting. We also automatically prove 75% of the Mizar theorems when the automated provers are helped by using only the premises used in the human-written Mizar proofs. We describe the methods and large-scale experiments leading to these results. This includes in particular the E and Vampire provers, their ENIGMA and Deepire learning modifications, a number of learning-based premise selection methods, and the incremental loop that interleaves growing a corpus of millions of ATP proofs with training increasingly strong AI/TP systems on them. We also present a selection of Mizar problems that were proved automatically.

Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org. URL: https://www.tensorflow.org/.
Jesse Alama, Tom Heskes, Daniel Kühlwein, Evgeni Tsivtsivadze, and Josef Urban. Premise selection for mathematics by corpus analysis and kernel methods. J. Autom. Reasoning, 52(2):191-213, 2014. URL: https://doi.org/10.1007/s10817-013-9286-5.
Grzegorz Bancerek, Czeslaw Bylinski, Adam Grabowski, Artur Kornilowicz, Roman Matuszewski, Adam Naumowicz, and Karol Pak. The role of the Mizar Mathematical Library for interactive proof development in Mizar. J. Autom. Reason., 61(1-4):9-32, 2018. URL: https://doi.org/10.1007/s10817-017-9440-6.
Grzegorz Bancerek, Czeslaw Bylinski, Adam Grabowski, Artur Kornilowicz, Roman Matuszewski, Adam Naumowicz, Karol Pak, and Josef Urban. Mizar: State-of-the-art and beyond. In Manfred Kerber, Jacques Carette, Cezary Kaliszyk, Florian Rabe, and Volker Sorge, editors, Intelligent Computer Mathematics - International Conference, CICM 2015, Washington, DC, USA, July 13-17, 2015, Proceedings, volume 9150 of Lecture Notes in Computer Science, pages 261-279. Springer, 2015. URL: https://doi.org/10.1007/978-3-319-20615-8_17.
Grzegorz Bancerek and Piotr Rudnicki. A Compendium of Continuous Lattices in MIZAR. J. Autom. Reasoning, 29(3-4):189-224, 2002. URL: https://doi.org/10.1023/A:1021966832558.
Jasmin Christian Blanchette, David Greenaway, Cezary Kaliszyk, Daniel Kühlwein, and Josef Urban. A learning-based fact selector for Isabelle/HOL. J. Autom. Reasoning, 57(3):219-244, 2016. URL: https://doi.org/10.1007/s10817-016-9362-8.
Jasmin Christian Blanchette, Cezary Kaliszyk, Lawrence C. Paulson, and Josef Urban. Hammering towards QED. J. Formalized Reasoning, 9(1):101-148, 2016. URL: https://doi.org/10.6092/issn.1972-5787/4593.
Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, pages 785-794, New York, NY, USA, 2016. ACM. URL: https://doi.org/10.1145/2939672.2939785.
Kyunghyun Cho, Bart van Merrienboer, Çaglar Gülçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Alessandro Moschitti, Bo Pang, and Walter Daelemans, editors, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, pages 1724-1734. ACL, 2014. URL: https://doi.org/10.3115/v1/d14-1179.
Karel Chvalovský, Jan Jakubuv, Miroslav Olsák, and Josef Urban. Learning theorem proving components. In Anupam Das and Sara Negri, editors, Automated Reasoning with Analytic Tableaux and Related Methods - 30th International Conference, TABLEAUX 2021, Birmingham, UK, September 6-9, 2021, Proceedings, volume 12842 of Lecture Notes in Computer Science, pages 266-278. Springer, 2021. URL: https://doi.org/10.1007/978-3-030-86059-2_16.
Karel Chvalovský, Jan Jakubuv, Martin Suda, and Josef Urban. ENIGMA-NG: Efficient neural and gradient-boosted inference guidance for E. In Pascal Fontaine, editor, Automated Deduction - CADE 27 - 27th International Conference on Automated Deduction, Natal, Brazil, August 27-30, 2019, Proceedings, volume 11716 of Lecture Notes in Computer Science, pages 197-215. Springer, 2019. URL: https://doi.org/10.1007/978-3-030-29436-6_12.
Richard Evans, David Saxton, David Amos, Pushmeet Kohli, and Edward Grefenstette. Can neural networks understand logical entailment? In International Conference on Learning Representations, 2018. URL: https://openreview.net/forum?id=SkZxCk-0Z.
Thibault Gauthier. Deep reinforcement learning for synthesizing functions in higher-order logic. In LPAR, volume 73 of EPiC Series in Computing, pages 230-248. EasyChair, 2020.
Thibault Gauthier, Cezary Kaliszyk, and Josef Urban. TacticToe: Learning to reason with HOL4 tactics. In Thomas Eiter and David Sands, editors, LPAR-21, 21st International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Maun, Botswana, May 7-12, 2017, volume 46 of EPiC, pages 125-143. EasyChair, 2017. URL: https://doi.org/10.29007/ntlb.
Thibault Gauthier, Cezary Kaliszyk, Josef Urban, Ramana Kumar, and Michael Norrish. Tactictoe: Learning to prove with tactics. J. Autom. Reason., 65(2):257-286, 2021. URL: https://doi.org/10.1007/s10817-020-09580-x.
Thibault Gauthier, Miroslav Olsák, and Josef Urban. Alien coding. CoRR, abs/2301.11479, 2023.
Thibault Gauthier and Josef Urban. Learning program synthesis for integer sequences from scratch. CoRR, abs/2202.11908, 2022.
Zarathustra Goertzel, Jan Jakubův, and Josef Urban. ENIGMAWatch: ProofWatch meets ENIGMA. In Serenella Cerrito and Andrei Popescu, editors, Automated Reasoning with Analytic Tableaux and Related Methods, pages 374-388, Cham, 2019. Springer International Publishing.
Zarathustra Amadeus Goertzel. Make E smart again (short paper). In IJCAR (2), volume 12167 of Lecture Notes in Computer Science, pages 408-415. Springer, 2020.
Zarathustra Amadeus Goertzel, Karel Chvalovský, Jan Jakubuv, Miroslav Olsák, and Josef Urban. Fast and slow Enigmas and parental guidance. In FroCoS, volume 12941 of Lecture Notes in Computer Science, pages 173-191. Springer, 2021.
Zarathustra Amadeus Goertzel, Jan Jakubuv, Cezary Kaliszyk, Miroslav Olsák, Jelle Piepenbrock, and Josef Urban. The Isabelle ENIGMA. In ITP, volume 237 of LIPIcs, pages 16:1-16:21. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022.
Adam Grabowski, Artur Korniłowicz, and Adam Naumowicz. Mizar in a nutshell. J. Formalized Reasoning, 3(2):153-245, 2010.
Frank Hutter, Holger H. Hoos, Kevin Leyton-Brown, and Thomas Stützle. ParamILS: an automatic algorithm configuration framework. J. Artificial Intelligence Research, 36:267-306, October 2009.
Jan Jakubův and Josef Urban. Hierarchical invention of theorem proving strategies. AI Commun., 31(3):237-250, 2018. URL: https://doi.org/10.3233/AIC-180761.
Jan Jakubuv, Karel Chvalovský, Miroslav Olsák, Bartosz Piotrowski, Martin Suda, and Josef Urban. ENIGMA anonymous: Symbol-independent inference guiding machine (system description). In IJCAR (2), volume 12167 of Lecture Notes in Computer Science, pages 448-463. Springer, 2020.
Jan Jakubuv, Martin Suda, and Josef Urban. Automated invention of strategies and term orderings for vampire. In GCAI, volume 50 of EPiC Series in Computing, pages 121-133. EasyChair, 2017.
Jan Jakubuv and Josef Urban. BliStrTune: hierarchical invention of theorem proving strategies. In Yves Bertot and Viktor Vafeiadis, editors, Proceedings of the 6th ACM SIGPLAN Conference on Certified Programs and Proofs, CPP 2017, Paris, France, January 16-17, 2017, pages 43-52. ACM, 2017. URL: https://doi.org/10.1145/3018610.3018619.
Jan Jakubuv and Josef Urban. Enhancing ENIGMA given clause guidance. In Florian Rabe, William M. Farmer, Grant O. Passmore, and Abdou Youssef, editors, Intelligent Computer Mathematics - 11th International Conference, CICM 2018, Hagenberg, Austria, August 13-17, 2018, Proceedings, volume 11006 of Lecture Notes in Computer Science, pages 118-124. Springer, 2018. URL: https://doi.org/10.1007/978-3-319-96812-4_11.
Jan Jakubuv and Josef Urban. Hammering Mizar by learning clause guidance. In John Harrison, John O'Leary, and Andrew Tolmach, editors, 10th International Conference on Interactive Theorem Proving, ITP 2019, September 9-12, 2019, Portland, OR, USA, volume 141 of LIPIcs, pages 34:1-34:8. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019. URL: https://doi.org/10.4230/LIPIcs.ITP.2019.34.
Cezary Kaliszyk and Josef Urban. Stronger automation for Flyspeck by feature weighting and strategy evolution. In Jasmin Christian Blanchette and Josef Urban, editors, PxTP 2013, volume 14 of EPiC Series, pages 87-95. EasyChair, 2013.
Cezary Kaliszyk and Josef Urban. Learning-assisted automated reasoning with Flyspeck. J. Autom. Reasoning, 53(2):173-213, 2014. URL: https://doi.org/10.1007/s10817-014-9303-3.
Cezary Kaliszyk and Josef Urban. MizAR 40 for Mizar 40. J. Autom. Reasoning, 55(3):245-256, 2015. URL: https://doi.org/10.1007/s10817-015-9330-8.
Cezary Kaliszyk, Josef Urban, and Jirí Vyskocil. Efficient semantic features for automated reasoning over large theories. In IJCAI, pages 3084-3090. AAAI Press, 2015.
Cezary Kaliszyk, Josef Urban, and Jirí Vyskocil. Automating formalization by statistical and semantic parsing of mathematics. In ITP, volume 10499 of Lecture Notes in Computer Science, pages 12-27. Springer, 2017.
Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. Lightgbm: A highly efficient gradient boosting decision tree. In NIPS, pages 3146-3154, 2017.
Artur Korniłowicz and Christoph Schwarzweller. Computers and algorithms in Mizar. Mechanized Mathematics and Its Applications, 4(1):43-50, 2005.
Laura Kovács and Andrei Voronkov. First-order theorem proving and Vampire. In Natasha Sharygina and Helmut Veith, editors, CAV, volume 8044 of LNCS, pages 1-35. Springer, 2013. URL: https://doi.org/10.1007/978-3-642-39799-8_1.
Guillaume Lample and François Charton. Deep learning for symbolic mathematics. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net, 2020. URL: https://openreview.net/forum?id=S1eZYeHFDS.
Miroslav Olsák, Cezary Kaliszyk, and Josef Urban. Property invariant embedding for automated reasoning. In Giuseppe De Giacomo, Alejandro Catalá, Bistra Dilkina, Michela Milano, Senén Barro, Alberto Bugarín, and Jérôme Lang, editors, ECAI 2020 - 24th European Conference on Artificial Intelligence, volume 325 of Frontiers in Artificial Intelligence and Applications, pages 1395-1402. IOS Press, 2020. URL: https://doi.org/10.3233/FAIA200244.
David W. Opitz and Richard Maclin. Popular ensemble methods: An empirical study. J. Artif. Intell. Res., 11:169-198, 1999. URL: https://doi.org/10.1613/jair.614.
Bartosz Piotrowski and Josef Urban. ATPboost: Learning premise selection in binary setting with ATP feedback. In Didier Galmiche, Stephan Schulz, and Roberto Sebastiani, editors, Automated Reasoning - 9th International Joint Conference, IJCAR 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 14-17, 2018, Proceedings, volume 10900 of Lecture Notes in Computer Science, pages 566-574. Springer, 2018. URL: https://doi.org/10.1007/978-3-319-94205-6_37.
Bartosz Piotrowski and Josef Urban. Stateful premise selection by recurrent neural networks. In LPAR, volume 73 of EPiC Series in Computing, pages 409-422. EasyChair, 2020.
Bartosz Piotrowski, Josef Urban, Chad E. Brown, and Cezary Kaliszyk. Can neural networks learn symbolic rewriting? CoRR, abs/1911.04873, 2019.
John Alan Robinson and Andrei Voronkov, editors. Handbook of Automated Reasoning (in 2 volumes). Elsevier and MIT Press, 2001. URL: https://www.sciencedirect.com/book/9780444508133/handbook-of-automated-reasoning.
Stephan Schulz. E - A Brainiac Theorem Prover. AI Commun., 15(2-3):111-126, 2002. URL: http://iospress.metapress.com/content/n908n94nmvk59v3c/.
Stephan Schulz. Simple and efficient clause subsumption with feature vector indexing. In Automated Reasoning and Mathematics, volume 7788 of Lecture Notes in Computer Science, pages 45-67. Springer, 2013.
Stephan Schulz. System description: E 1.8. In Kenneth L. McMillan, Aart Middeldorp, and Andrei Voronkov, editors, LPAR, volume 8312 of LNCS, pages 735-743. Springer, 2013. URL: https://doi.org/10.1007/978-3-642-45221-5_49.
Stephan Schulz. System Description: E 1.8. In Ken McMillan, Aart Middeldorp, and Andrei Voronkov, editors, Proc. of the 19th LPAR, Stellenbosch, volume 8312 of LNCS, pages 735-743. Springer, 2013.
Stephan Schulz, Simon Cruanes, and Petar Vukmirović. Faster, higher, stronger: E 2.3. In Pascal Fontaine, editor, Proc. of the 27th CADE, Natal, Brasil, number 11716 in LNAI, pages 495-507. Springer, 2019.
Stephan Schulz and Geoff Sutcliffe. Proof generation for saturating first-order theorem provers. In David Delahaye and Bruno Woltzenlogel Paleo, editors, All about Proofs, Proofs for All, volume 55 of Mathematical Logic and Foundations, pages 45-61. College Publications, London, UK, January 2015.
Martin Suda. Improving ENIGMA-style clause selection while learning from history. In André Platzer and Geoff Sutcliffe, editors, Automated Deduction - CADE 28 - 28th International Conference on Automated Deduction, Virtual Event, July 12-15, 2021, Proceedings, volume 12699 of Lecture Notes in Computer Science, pages 543-561. Springer, 2021. URL: https://doi.org/10.1007/978-3-030-79876-5_31.
Martin Suda. Vampire with a brain is a good ITP hammer. In Boris Konev and Giles Reger, editors, Frontiers of Combining Systems - 13th International Symposium, FroCoS 2021, Birmingham, UK, September 8-10, 2021, Proceedings, volume 12941 of Lecture Notes in Computer Science, pages 192-209. Springer, 2021. URL: https://doi.org/10.1007/978-3-030-86205-3_11.
Tanel Tammet. Towards efficient subsumption. In CADE, volume 1421 of Lecture Notes in Computer Science, pages 427-441. Springer, 1998.
J. Urban. Translating Mizar for First Order Theorem Provers. In A. Asperti, B. Buchberger, and J.H. Davenport, editors, Proceedings of the 2nd International Conference on Mathematical Knowledge Management, number 2594 in LNCS, pages 203-215. Springer, 2003.
Josef Urban. MPTP - Motivation, Implementation, First Experiments. J. Autom. Reasoning, 33(3-4):319-339, 2004. URL: https://doi.org/10.1007/s10817-004-6245-1.
Josef Urban. MPTP 0.2: Design, implementation, and initial experiments. J. Autom. Reasoning, 37(1-2):21-43, 2006. URL: https://doi.org/10.1007/s10817-006-9032-3.
Josef Urban. BliStr: The Blind Strategymaker. In Georg Gottlob, Geoff Sutcliffe, and Andrei Voronkov, editors, Global Conference on Artificial Intelligence, GCAI 2015, Tbilisi, Georgia, October 16-19, 2015, volume 36 of EPiC Series in Computing, pages 312-319. EasyChair, 2015. URL: http://www.easychair.org/publications/paper/BliStr_The_Blind_Strategymaker, URL: https://doi.org/10.29007/8n7m.
Josef Urban and Jan Jakubuv. First neural conjecturing datasets and experiments. In Christoph Benzmüller and Bruce R. Miller, editors, Intelligent Computer Mathematics - 13th International Conference, CICM 2020, Bertinoro, Italy, July 26-31, 2020, Proceedings, volume 12236 of Lecture Notes in Computer Science, pages 315-323. Springer, 2020. URL: https://doi.org/10.1007/978-3-030-53518-6_24.
Josef Urban, Geoff Sutcliffe, Petr Pudlák, and Jiří Vyskočil. MaLARea SG1 - Machine Learner for Automated Reasoning with Semantic Guidance. In IJCAR, pages 441-456, 2008. URL: https://doi.org/10.1007/978-3-540-71070-7_37.
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. CoRR, abs/1706.03762, 2017. URL: https://arxiv.org/abs/1706.03762.
Qingxiang Wang, Cezary Kaliszyk, and Josef Urban. First experiments with neural translation of informal to formal mathematics. In Florian Rabe, William M. Farmer, Grant O. Passmore, and Abdou Youssef, editors, 11th International Conference on Intelligent Computer Mathematics (CICM 2018), volume 11006 of LNCS, pages 255-270. Springer, 2018. URL: https://doi.org/10.1007/978-3-319-96812-4_22.

MizAR 60 for Mizar 50

Authors Jan Jakubův , Karel Chvalovský , Zarathustra Goertzel , Cezary Kaliszyk , Mirek Olšák , Bartosz Piotrowski , Stephan Schulz , Martin Suda , Josef Urban

File

Document Identifiers

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

Acknowledgements

References

Thanks for your feedback!

Could not send message

MizAR 60 for Mizar 50

Authors Jan Jakubův , Karel Chvalovský , Zarathustra Goertzel , Cezary Kaliszyk , Mirek Olšák , Bartosz Piotrowski , Stephan Schulz , Martin Suda , Josef Urban

File

Document Identifiers

Related Versions

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

Funding

Acknowledgements

Supplementary Materials

References

Thanks for your feedback!

Could not send message