Putting Randomized Compiler Testing into Production (Experience Report)

Authors Alastair F. Donaldson, Hugues Evrard, Paul Thomson


Author Details

Alastair F. Donaldson
  • Google, London, United Kingdom
  • Imperial College London, United Kingdom
Hugues Evrard
  • Google, London, United Kingdom
Paul Thomson
  • Google, London, United Kingdom


We are grateful to David Neto and to the anonymous ECOOP 2020 reviewers for their feedback on an earlier draft of this work.

Cite As

Alastair F. Donaldson, Hugues Evrard, and Paul Thomson. Putting Randomized Compiler Testing into Production (Experience Report). In 34th European Conference on Object-Oriented Programming (ECOOP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 166, pp. 22:1-22:29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


We describe our experience over the last 18 months on a compiler testing technology transfer project: taking the GraphicsFuzz research project on randomized metamorphic testing of graphics shader compilers, and building the necessary tooling around it to provide a highly automated process for improving the Khronos Vulkan Conformance Test Suite (CTS) with test cases that expose fuzzer-found compiler bugs, or that plug gaps in test coverage. We present this tooling for test automation - gfauto - in detail, as well as our use of differential coverage and test case reduction as a method for automatically synthesizing tests that fill coverage gaps. We explain the value that GraphicsFuzz has provided in automatically testing the ecosystem of tools for transforming, optimizing and validating Vulkan shaders, and the challenges faced when testing a tool ecosystem rather than a single tool. We discuss practical issues associated with putting automated metamorphic testing into production, related to test case validity, bug de-duplication and floating-point precision, and provide illustrative examples of bugs found during our work.

Subject Classification

ACM Subject Classification
  • Software and its engineering → Compilers
  • Software and its engineering → Software testing and debugging
Keywords
  • Compilers
  • metamorphic testing
  • 3D graphics
  • experience report



