Verifying Peephole Rewriting in SSA Compiler IRs

Authors Siddharth Bhat , Alex Keizer , Chris Hughes, Andrés Goens , Tobias Grosser



PDF
Thumbnail PDF

File

LIPIcs.ITP.2024.9.pdf
  • Filesize: 0.79 MB
  • 20 pages

Document Identifiers

Author Details

Siddharth Bhat
  • Cambridge University, UK
Alex Keizer
  • Cambridge University, UK
Chris Hughes
  • University of Edinburgh, UK
Andrés Goens
  • University of Amsterdam, The Netherlands
Tobias Grosser
  • Cambridge University, UK

Acknowledgements

We thank Sébastien Michelland and Sebastian Ullrich for their early help in this project and feedback, as well as Anton Lorenzen for his helpful feedback.

Cite AsGet BibTex

Siddharth Bhat, Alex Keizer, Chris Hughes, Andrés Goens, and Tobias Grosser. Verifying Peephole Rewriting in SSA Compiler IRs. In 15th International Conference on Interactive Theorem Proving (ITP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 309, pp. 9:1-9:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ITP.2024.9

Abstract

There is an increasing need for domain-specific reasoning in modern compilers. This has fueled the use of tailored intermediate representations (IRs) based on static single assignment (SSA), like in the MLIR compiler framework. Interactive theorem provers (ITPs) provide strong guarantees for the end-to-end verification of compilers (e.g., CompCert). However, modern compilers and their IRs evolve at a rate that makes proof engineering alongside them prohibitively expensive. Nevertheless, well-scoped push-button automated verification tools such as the Alive peephole verifier for LLVM-IR gained recognition in domains where SMT solvers offer efficient (semi) decision procedures. In this paper, we aim to combine the convenience of automation with the versatility of ITPs for verifying peephole rewrites across domain-specific IRs. We formalize a core calculus for SSA-based IRs that is generic over the IR and covers so-called regions (nested scoping used by many domain-specific IRs in the MLIR ecosystem). Our mechanization in the Lean proof assistant provides a user-friendly frontend for translating MLIR syntax into our calculus. We provide scaffolding for defining and verifying peephole rewrites, offering tactics to eliminate the abstraction overhead of our SSA calculus. We prove correctness theorems about peephole rewriting, as well as two classical program transformations. To evaluate our framework, we consider three use cases from the MLIR ecosystem that cover different levels of abstractions: (1) bitvector rewrites from LLVM, (2) structured control flow, and (3) fully homomorphic encryption. We envision that our mechanization provides a foundation for formally verified rewrites on new domain-specific IRs.

Subject Classification

ACM Subject Classification
  • Software and its engineering → Compilers
  • Software and its engineering → Semantics
  • Computing methodologies → Theorem proving algorithms
  • Theory of computation → Rewrite systems
Keywords
  • compilers
  • semantics
  • mechanization
  • MLIR
  • SSA
  • regions
  • peephole rewrites

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. AV Aho, R Sethi, and JD Ullman. Compilers: Principles, Techniques, and Tools. Citeseer, 1985. Google Scholar
  2. Andrew W Appel. SSA is functional programming. Acm Sigplan Notices, 33(4):17-20, 1998. Google Scholar
  3. Seongwon Bang, Seunghyeon Nam, Inwhan Chun, Ho Young Jhoo, and Juneyoung Lee. SMT-based translation validation for machine learning compiler. In Sharon Shoham and Yakir Vizel, editors, Computer Aided Verification - 34th International Conference, CAV 2022, Haifa, Israel, August 7-10, 2022, Proceedings, Part II, volume 13372 of Lecture Notes in Computer Science, pages 386-407. Springer, 2022. URL: https://doi.org/10.1007/978-3-031-13188-2_19.
  4. Clark Barrett, Aaron Stump, Cesare Tinelli, et al. The smt-lib standard: Version 2.0. In Proceedings of the 8th international workshop on satisfiability modulo theories (Edinburgh, UK), volume 13, page 14, 2010. Google Scholar
  5. Gilles Barthe, Delphine Demange, and David Pichardie. Formal verification of an SSA-based middle-end for CompCert. ACM Transactions on Programming Languages and Systems (TOPLAS), 36(1):1-35, 2014. Google Scholar
  6. Siddharth Bhat and Tobias Grosser. Lambda the ultimate ssa: optimizing functional programs in ssa. In 2022 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pages 1-11. IEEE, 2022. Google Scholar
  7. Nathanaël Courant and Xavier Leroy. Verified code generation for the polyhedral model. Proceedings of the ACM on Programming Languages, 5(POPL):1-24, 2021. Google Scholar
  8. Leonardo De Moura and Nikolaj Bjørner. Z3: An efficient SMT solver. In Tools and Algorithms for the Construction and Analysis of Systems: 14th International Conference, TACAS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29-April 6, 2008. Proceedings 14, pages 337-340. Springer, 2008. Google Scholar
  9. Leonardo de Moura and Sebastian Ullrich. The Lean 4 theorem prover and programming language. In International Conference on Automated Deduction, pages 625-635. Springer, 2021. Google Scholar
  10. Craig Gentry. A fully homomorphic encryption scheme. Stanford university, 2009. Google Scholar
  11. Benjamin Grégoire and Assia Mahboubi. Proving equalities in a commutative ring done right in Coq. In Joe Hurd and Thomas F. Melham, editors, Theorem Proving in Higher Order Logics, 18th International Conference, TPHOLs 2005, Oxford, UK, August 22-25, 2005, Proceedings, volume 3603 of Lecture Notes in Computer Science, pages 98-113. Springer, 2005. URL: https://doi.org/10.1007/11541868_7.
  12. Gérard Huet. The zipper. Journal of functional programming, 7(5):549-554, 1997. Google Scholar
  13. Petter Källström and Oscar Gustafsson. Fast and area efficient adder for wide data in recent Xilinx FPGAs. In 2016 26th International Conference on Field Programmable Logic and Applications (FPL), pages 1-4. IEEE, 2016. Google Scholar
  14. Jeehoon Kang, Yoonseung Kim, Chung-Kil Hur, Derek Dreyer, and Viktor Vafeiadis. Lightweight verification of separate compilation. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 178-190, 2016. Google Scholar
  15. Richard A Kelsey. A correspondence between continuation passing style and static single assignment form. ACM SIGPLAN Notices, 30(3):13-22, 1995. Google Scholar
  16. Jérémie Koenig and Zhong Shao. CompCertO: compiling certified open C components. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, pages 1095-1109, 2021. Google Scholar
  17. Chris Lattner and Vikram Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In International Symposium on Code Generation and Optimization, 2004. CGO 2004., pages 75-86. IEEE, 2004. Google Scholar
  18. Chris Lattner, Jacques Pienaar, Mehdi Amini, Uday Bondhugula, River Riddle, Albert Cohen, Tatiana Shpeisman, Andy Davis, Nicolas Vasilache, and Oleksandr Zinenko. MLIR: A compiler infrastructure for the end of Moore’s law. arXiv preprint arXiv:2002.11054, 2020. Google Scholar
  19. Juneyoung Lee, Chung-Kil Hur, and Nuno P Lopes. AliveInLean: a verified LLVM peephole optimization verifier. In International Conference on Computer Aided Verification, pages 445-455. Springer, 2019. Google Scholar
  20. Juneyoung Lee, Yoonseung Kim, Youngju Song, Chung-Kil Hur, Sanjoy Das, David Majnemer, John Regehr, and Nuno P Lopes. Taming undefined behavior in LLVM. ACM SIGPLAN Notices, 52(6):633-647, 2017. Google Scholar
  21. Xavier Leroy, Sandrine Blazy, Daniel Kästner, Bernhard Schommer, Markus Pister, and Christian Ferdinand. CompCert-a formally verified optimizing compiler. In ERTS 2016: Embedded Real Time Software and Systems, 8th European Congress, 2016. Google Scholar
  22. Liyi Li and Elsa L Gunter. K-LLVM: a relatively complete semantics of LLVM IR. In 34th European Conference on Object-Oriented Programming (ECOOP 2020), 2020. Google Scholar
  23. Nuno P Lopes, Juneyoung Lee, Chung-Kil Hur, Zhengyang Liu, and John Regehr. Alive2: bounded translation validation for LLVM. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, pages 65-79, 2021. Google Scholar
  24. Nuno P Lopes, David Menendez, Santosh Nagarakatte, and John Regehr. Provably correct peephole optimizations with alive. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 22-32, 2015. Google Scholar
  25. Carl McConnell and Ralph E. Johnson. Using static single assignment form in a code optimizer. ACM Lett. Program. Lang. Syst., 1(2):152-160, June 1992. URL: https://doi.org/10.1145/151333.151368.
  26. William M McKeeman. Peephole optimization. Communications of the ACM, 8(7):443-444, 1965. Google Scholar
  27. Manasij Mukherjee and John Regehr. Hydra: Generalizing peephole optimizations with program synthesis. Proceedings of the ACM on Programming Languages, (OOPSLA), 2024. Google Scholar
  28. Eric Mullen, Daryl Zuniga, Zachary Tatlock, and Dan Grossman. Verified peephole optimizations for CompCert. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 448-461, 2016. Google Scholar
  29. Georg Neis, Chung-Kil Hur, Jan-Oliver Kaiser, Craig McLaughlin, Derek Dreyer, and Viktor Vafeiadis. Pilsner: A compositionally verified compiler for a higher-order imperative language. In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming, pages 166-178, 2015. Google Scholar
  30. Sunjae Park, Woosung Song, Seunghyeon Nam, Hyeongyu Kim, Junbum Shin, and Juneyoung Lee. HEaaN.MLIR: An optimizing compiler for fast ring-based homomorphic encryption. Proceedings of the 44th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2023. Google Scholar
  31. Daniel Patterson and Amal Ahmed. The next 700 compiler correctness theorems (functional pearl). Proceedings of the ACM on Programming Languages, 3(ICFP):1-29, 2019. Google Scholar
  32. Anurudh Peduri, Siddharth Bhat, and Tobias Grosser. QSSA: an SSA-based IR for quantum computing. In Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction, pages 2-14, 2022. Google Scholar
  33. Benjamin C Pierce and C Benjamin. Types and programming languages. MIT press, 2002. Google Scholar
  34. Fabrice Rastello and Florent Bouchez Tichadou. SSA-based Compiler Design. Springer Nature, 2022. Google Scholar
  35. Michael Sammler, Simon Spies, Youngju Song, Emanuele D'Osualdo, Robbert Krebbers, Deepak Garg, and Derek Dreyer. Dimsum: A decentralized approach to multi-language semantics and verification. Proceedings of the ACM on Programming Languages, 7(POPL):775-805, 2023. Google Scholar
  36. Youngju Song, Minki Cho, Dongjoo Kim, Yonghyun Kim, Jeehoon Kang, and Chung-Kil Hur. CompCertM: CompCert with C-assembly linking and lightweight modular verification. Proceedings of the ACM on Programming Languages, 4(POPL):1-31, 2019. Google Scholar
  37. Gordon Stewart, Lennart Beringer, Santiago Cuellar, and Andrew W Appel. Compositional CompCert. In Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 275-287, 2015. Google Scholar
  38. Sebastian Ullrich and Leonardo de Moura. Beyond notations: Hygienic macro expansion for theorem proving languages. In International Joint Conference on Automated Reasoning, pages 167-182. Springer, 2020. Google Scholar
  39. Nicolas Vasilache, Oleksandr Zinenko, Aart J. C. Bik, Mahesh Ravishankar, Thomas Raoux, Alexander Belyaev, Matthias Springer, Tobias Gysi, Diego Caballero, Stephan Herhut, Stella Laurenzo, and Albert Cohen. Composable and modular code generation in MLIR: A structured and retargetable approach to tensor compiler construction. CoRR, abs/2202.03293, 2022. URL: https://arxiv.org/abs/2202.03293.
  40. Alexander Viand, Patrick Jattke, Miro Haller, and Anwar Hithnawi. HECO: Automatic code optimizations for efficient fully homomorphic encryption. arXiv preprint arXiv:2202.01649, 2022. Google Scholar
  41. Wei Wang and Xinming Huang. A novel fast modular multiplier architecture for 8,192-bit RSA cryposystem. In 2013 IEEE High Performance Extreme Computing Conference (HPEC), pages 1-5. IEEE, 2013. Google Scholar
  42. Yuting Wang, Pierre Wilke, and Zhong Shao. An abstract stack based approach to verified compositional compilation to machine code. Proceedings of the ACM on Programming Languages, 3(POPL):1-30, 2019. Google Scholar
  43. Dominik Winterer, Chengyu Zhang, and Zhendong Su. Validating SMT solvers via semantic fusion. In Alastair F. Donaldson and Emina Torlak, editors, Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2020, London, UK, June 15-20, 2020, pages 718-730. ACM, 2020. URL: https://doi.org/10.1145/3385412.3385985.
  44. Li-yao Xia, Yannick Zakowski, Paul He, Chung-Kil Hur, Gregory Malecha, Benjamin C Pierce, and Steve Zdancewic. Interaction trees: representing recursive and impure programs in Coq. arXiv preprint arXiv:1906.00046, 2019. Google Scholar
  45. Yannick Zakowski, Calvin Beck, Irene Yoon, Ilia Zaichuk, Vadim Zaliva, and Steve Zdancewic. Modular, compositional, and executable formal semantics for LLVM IR. Proc. ACM Program. Lang., 2021. URL: https://doi.org/10.1145/3473572.
  46. Jianzhou Zhao, Santosh Nagarakatte, Milo MK Martin, and Steve Zdancewic. Formalizing the LLVM intermediate representation for verified program transformations. In Proceedings of the 39th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 427-440, 2012. Google Scholar
  47. Jianzhou Zhao, Santosh Nagarakatte, Milo MK Martin, and Steve Zdancewic. Formal verification of SSA-based optimizations for LLVM. In Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation, pages 175-186, 2013. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail