Verifying Peephole Rewriting in SSA Compiler IRs

Bhat, Siddharth; Keizer, Alex; Hughes, Chris; Goens, Andrés; Grosser, Tobias

doi:10.4230/LIPIcs.ITP.2024.9

Abstract

There is an increasing need for domain-specific reasoning in modern compilers. This has fueled the use of tailored intermediate representations (IRs) based on static single assignment (SSA), like in the MLIR compiler framework. Interactive theorem provers (ITPs) provide strong guarantees for the end-to-end verification of compilers (e.g., CompCert). However, modern compilers and their IRs evolve at a rate that makes proof engineering alongside them prohibitively expensive. Nevertheless, well-scoped push-button automated verification tools such as the Alive peephole verifier for LLVM-IR gained recognition in domains where SMT solvers offer efficient (semi) decision procedures. In this paper, we aim to combine the convenience of automation with the versatility of ITPs for verifying peephole rewrites across domain-specific IRs. We formalize a core calculus for SSA-based IRs that is generic over the IR and covers so-called regions (nested scoping used by many domain-specific IRs in the MLIR ecosystem). Our mechanization in the Lean proof assistant provides a user-friendly frontend for translating MLIR syntax into our calculus. We provide scaffolding for defining and verifying peephole rewrites, offering tactics to eliminate the abstraction overhead of our SSA calculus. We prove correctness theorems about peephole rewriting, as well as two classical program transformations. To evaluate our framework, we consider three use cases from the MLIR ecosystem that cover different levels of abstractions: (1) bitvector rewrites from LLVM, (2) structured control flow, and (3) fully homomorphic encryption. We envision that our mechanization provides a foundation for formally verified rewrites on new domain-specific IRs.

AV Aho, R Sethi, and JD Ullman. Compilers: Principles, Techniques, and Tools. Citeseer, 1985.
Andrew W Appel. SSA is functional programming. Acm Sigplan Notices, 33(4):17-20, 1998.
Seongwon Bang, Seunghyeon Nam, Inwhan Chun, Ho Young Jhoo, and Juneyoung Lee. SMT-based translation validation for machine learning compiler. In Sharon Shoham and Yakir Vizel, editors, Computer Aided Verification - 34th International Conference, CAV 2022, Haifa, Israel, August 7-10, 2022, Proceedings, Part II, volume 13372 of Lecture Notes in Computer Science, pages 386-407. Springer, 2022. URL: https://doi.org/10.1007/978-3-031-13188-2_19.
Clark Barrett, Aaron Stump, Cesare Tinelli, et al. The smt-lib standard: Version 2.0. In Proceedings of the 8th international workshop on satisfiability modulo theories (Edinburgh, UK), volume 13, page 14, 2010.
Gilles Barthe, Delphine Demange, and David Pichardie. Formal verification of an SSA-based middle-end for CompCert. ACM Transactions on Programming Languages and Systems (TOPLAS), 36(1):1-35, 2014.
Siddharth Bhat and Tobias Grosser. Lambda the ultimate ssa: optimizing functional programs in ssa. In 2022 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pages 1-11. IEEE, 2022.
Nathanaël Courant and Xavier Leroy. Verified code generation for the polyhedral model. Proceedings of the ACM on Programming Languages, 5(POPL):1-24, 2021.
Leonardo De Moura and Nikolaj Bjørner. Z3: An efficient SMT solver. In Tools and Algorithms for the Construction and Analysis of Systems: 14th International Conference, TACAS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29-April 6, 2008. Proceedings 14, pages 337-340. Springer, 2008.
Leonardo de Moura and Sebastian Ullrich. The Lean 4 theorem prover and programming language. In International Conference on Automated Deduction, pages 625-635. Springer, 2021.
Craig Gentry. A fully homomorphic encryption scheme. Stanford university, 2009.
Benjamin Grégoire and Assia Mahboubi. Proving equalities in a commutative ring done right in Coq. In Joe Hurd and Thomas F. Melham, editors, Theorem Proving in Higher Order Logics, 18th International Conference, TPHOLs 2005, Oxford, UK, August 22-25, 2005, Proceedings, volume 3603 of Lecture Notes in Computer Science, pages 98-113. Springer, 2005. URL: https://doi.org/10.1007/11541868_7.
Gérard Huet. The zipper. Journal of functional programming, 7(5):549-554, 1997.
Petter Källström and Oscar Gustafsson. Fast and area efficient adder for wide data in recent Xilinx FPGAs. In 2016 26th International Conference on Field Programmable Logic and Applications (FPL), pages 1-4. IEEE, 2016.
Jeehoon Kang, Yoonseung Kim, Chung-Kil Hur, Derek Dreyer, and Viktor Vafeiadis. Lightweight verification of separate compilation. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 178-190, 2016.
Richard A Kelsey. A correspondence between continuation passing style and static single assignment form. ACM SIGPLAN Notices, 30(3):13-22, 1995.
Jérémie Koenig and Zhong Shao. CompCertO: compiling certified open C components. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, pages 1095-1109, 2021.
Chris Lattner and Vikram Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In International Symposium on Code Generation and Optimization, 2004. CGO 2004., pages 75-86. IEEE, 2004.
Chris Lattner, Jacques Pienaar, Mehdi Amini, Uday Bondhugula, River Riddle, Albert Cohen, Tatiana Shpeisman, Andy Davis, Nicolas Vasilache, and Oleksandr Zinenko. MLIR: A compiler infrastructure for the end of Moore’s law. arXiv preprint arXiv:2002.11054, 2020.
Juneyoung Lee, Chung-Kil Hur, and Nuno P Lopes. AliveInLean: a verified LLVM peephole optimization verifier. In International Conference on Computer Aided Verification, pages 445-455. Springer, 2019.
Juneyoung Lee, Yoonseung Kim, Youngju Song, Chung-Kil Hur, Sanjoy Das, David Majnemer, John Regehr, and Nuno P Lopes. Taming undefined behavior in LLVM. ACM SIGPLAN Notices, 52(6):633-647, 2017.
Xavier Leroy, Sandrine Blazy, Daniel Kästner, Bernhard Schommer, Markus Pister, and Christian Ferdinand. CompCert-a formally verified optimizing compiler. In ERTS 2016: Embedded Real Time Software and Systems, 8th European Congress, 2016.
Liyi Li and Elsa L Gunter. K-LLVM: a relatively complete semantics of LLVM IR. In 34th European Conference on Object-Oriented Programming (ECOOP 2020), 2020.
Nuno P Lopes, Juneyoung Lee, Chung-Kil Hur, Zhengyang Liu, and John Regehr. Alive2: bounded translation validation for LLVM. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, pages 65-79, 2021.
Nuno P Lopes, David Menendez, Santosh Nagarakatte, and John Regehr. Provably correct peephole optimizations with alive. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 22-32, 2015.
Carl McConnell and Ralph E. Johnson. Using static single assignment form in a code optimizer. ACM Lett. Program. Lang. Syst., 1(2):152-160, June 1992. URL: https://doi.org/10.1145/151333.151368.
William M McKeeman. Peephole optimization. Communications of the ACM, 8(7):443-444, 1965.
Manasij Mukherjee and John Regehr. Hydra: Generalizing peephole optimizations with program synthesis. Proceedings of the ACM on Programming Languages, (OOPSLA), 2024.
Eric Mullen, Daryl Zuniga, Zachary Tatlock, and Dan Grossman. Verified peephole optimizations for CompCert. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 448-461, 2016.
Georg Neis, Chung-Kil Hur, Jan-Oliver Kaiser, Craig McLaughlin, Derek Dreyer, and Viktor Vafeiadis. Pilsner: A compositionally verified compiler for a higher-order imperative language. In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming, pages 166-178, 2015.
Sunjae Park, Woosung Song, Seunghyeon Nam, Hyeongyu Kim, Junbum Shin, and Juneyoung Lee. HEaaN.MLIR: An optimizing compiler for fast ring-based homomorphic encryption. Proceedings of the 44th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2023.
Daniel Patterson and Amal Ahmed. The next 700 compiler correctness theorems (functional pearl). Proceedings of the ACM on Programming Languages, 3(ICFP):1-29, 2019.
Anurudh Peduri, Siddharth Bhat, and Tobias Grosser. QSSA: an SSA-based IR for quantum computing. In Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction, pages 2-14, 2022.
Benjamin C Pierce and C Benjamin. Types and programming languages. MIT press, 2002.
Fabrice Rastello and Florent Bouchez Tichadou. SSA-based Compiler Design. Springer Nature, 2022.
Michael Sammler, Simon Spies, Youngju Song, Emanuele D'Osualdo, Robbert Krebbers, Deepak Garg, and Derek Dreyer. Dimsum: A decentralized approach to multi-language semantics and verification. Proceedings of the ACM on Programming Languages, 7(POPL):775-805, 2023.
Youngju Song, Minki Cho, Dongjoo Kim, Yonghyun Kim, Jeehoon Kang, and Chung-Kil Hur. CompCertM: CompCert with C-assembly linking and lightweight modular verification. Proceedings of the ACM on Programming Languages, 4(POPL):1-31, 2019.
Gordon Stewart, Lennart Beringer, Santiago Cuellar, and Andrew W Appel. Compositional CompCert. In Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 275-287, 2015.
Sebastian Ullrich and Leonardo de Moura. Beyond notations: Hygienic macro expansion for theorem proving languages. In International Joint Conference on Automated Reasoning, pages 167-182. Springer, 2020.
Nicolas Vasilache, Oleksandr Zinenko, Aart J. C. Bik, Mahesh Ravishankar, Thomas Raoux, Alexander Belyaev, Matthias Springer, Tobias Gysi, Diego Caballero, Stephan Herhut, Stella Laurenzo, and Albert Cohen. Composable and modular code generation in MLIR: A structured and retargetable approach to tensor compiler construction. CoRR, abs/2202.03293, 2022. URL: https://arxiv.org/abs/2202.03293.
Alexander Viand, Patrick Jattke, Miro Haller, and Anwar Hithnawi. HECO: Automatic code optimizations for efficient fully homomorphic encryption. arXiv preprint arXiv:2202.01649, 2022.
Wei Wang and Xinming Huang. A novel fast modular multiplier architecture for 8,192-bit RSA cryposystem. In 2013 IEEE High Performance Extreme Computing Conference (HPEC), pages 1-5. IEEE, 2013.
Yuting Wang, Pierre Wilke, and Zhong Shao. An abstract stack based approach to verified compositional compilation to machine code. Proceedings of the ACM on Programming Languages, 3(POPL):1-30, 2019.
Dominik Winterer, Chengyu Zhang, and Zhendong Su. Validating SMT solvers via semantic fusion. In Alastair F. Donaldson and Emina Torlak, editors, Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2020, London, UK, June 15-20, 2020, pages 718-730. ACM, 2020. URL: https://doi.org/10.1145/3385412.3385985.
Li-yao Xia, Yannick Zakowski, Paul He, Chung-Kil Hur, Gregory Malecha, Benjamin C Pierce, and Steve Zdancewic. Interaction trees: representing recursive and impure programs in Coq. arXiv preprint arXiv:1906.00046, 2019.
Yannick Zakowski, Calvin Beck, Irene Yoon, Ilia Zaichuk, Vadim Zaliva, and Steve Zdancewic. Modular, compositional, and executable formal semantics for LLVM IR. Proc. ACM Program. Lang., 2021. URL: https://doi.org/10.1145/3473572.
Jianzhou Zhao, Santosh Nagarakatte, Milo MK Martin, and Steve Zdancewic. Formalizing the LLVM intermediate representation for verified program transformations. In Proceedings of the 39th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 427-440, 2012.
Jianzhou Zhao, Santosh Nagarakatte, Milo MK Martin, and Steve Zdancewic. Formal verification of SSA-based optimizations for LLVM. In Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation, pages 175-186, 2013.

Verifying Peephole Rewriting in SSA Compiler IRs

Authors Siddharth Bhat , Alex Keizer , Chris Hughes, Andrés Goens , Tobias Grosser

File

Document Identifiers

Author Details

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

Verifying Peephole Rewriting in SSA Compiler IRs

Authors Siddharth Bhat , Alex Keizer , Chris Hughes, Andrés Goens , Tobias Grosser

File

Document Identifiers

Author Details

Funding

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

Supplementary Materials

References

Thanks for your feedback!

Could not send message