Modular Abstract Definitional Interpreters for WebAssembly

Authors Katharina Brandl, Sebastian Erdweg, Sven Keidel, Nils Hansen



PDF
Thumbnail PDF

File

LIPIcs.ECOOP.2023.5.pdf
  • Filesize: 0.74 MB
  • 28 pages

Document Identifiers

Author Details

Katharina Brandl
  • Johannes Gutenberg-Universität Mainz, Germany
Sebastian Erdweg
  • Johannes Gutenberg-Universität Mainz, Germany
Sven Keidel
  • TU Darmstadt, Germany
Nils Hansen
  • Johannes Gutenberg-Universität Mainz, Germany

Acknowledgements

We thank the anonymous reviewers for their effort and helpful suggestions.

Cite As Get BibTex

Katharina Brandl, Sebastian Erdweg, Sven Keidel, and Nils Hansen. Modular Abstract Definitional Interpreters for WebAssembly. In 37th European Conference on Object-Oriented Programming (ECOOP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 263, pp. 5:1-5:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023) https://doi.org/10.4230/LIPIcs.ECOOP.2023.5

Abstract

Even though static analyses can improve performance and secure programs against vulnerabilities, no static whole-program analyses exist for WebAssembly (Wasm) to date. Part of the reason is that Wasm has many complex language concerns, and it is not obvious how to adopt existing analysis frameworks for these features. This paper explores how abstract definitional interpretation can be used to develop sophisticated analyses for Wasm and other complex languages efficiently. In particular, we show that the semantics of Wasm can be decomposed into 19 language-independent components that abstract different aspects of Wasm. We have written a highly configurable definitional interpreter for full Wasm 1.0 in 1628 LOC against these components. Analysis developers can instantiate this interpreter with different value and effect abstractions to obtain abstract definitional interpreters that compute inter-procedural control and data-flow information. This way, we develop the first whole-program dead code, constant propagation, and taint analyses for Wasm, each in less than 210 LOC. We evaluate our analyses on 1458 Wasm binaries collected by others in the wild. Our implementation is based on a novel framework for definitional abstract interpretation in Scala that eliminates scalability issues of prior work.

Subject Classification

ACM Subject Classification
  • Software and its engineering → Automated static analysis
Keywords
  • Static Analysis
  • WebAssembly

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick D. McDaniel. Flowdroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. In Michael F. P. O'Boyle and Keshav Pingali, editors, ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '14, Edinburgh, United Kingdom - June 09 - 11, 2014, pages 259-269. ACM, 2014. URL: https://doi.org/10.1145/2594291.2594299.
  2. Darren C. Atkinson. Accurate call graph extraction of programs with function pointers using type signatures. In 11th Asia-Pacific Software Engineering Conference (APSEC 2004), 30 November - 3 December 2004, Busan, Korea, pages 326-335. IEEE Computer Society, 2004. URL: https://doi.org/10.1109/APSEC.2004.16.
  3. Gogul Balakrishnan and Thomas W. Reps. Analyzing memory accesses in x86 executables. In Evelyn Duesterwald, editor, Compiler Construction, 13th International Conference, CC 2004, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2004, Barcelona, Spain, March 29 - April 2, 2004, Proceedings, volume 2985 of Lecture Notes in Computer Science, pages 5-23. Springer, 2004. URL: https://doi.org/10.1007/978-3-540-24723-4_2.
  4. Roberto Barbuti, Nicoletta De Francesco, and Luca Tesei. An abstract interpretation approach for enhancing the java bytecode verifier. Comput. J., 53(6):679-700, 2010. URL: https://doi.org/10.1093/comjnl/bxp031.
  5. Eric Bodden. Inter-procedural data-flow analysis with IFDS/IDE and soot. In Eric Bodden, Laurie J. Hendren, Patrick Lam, and Elena Sherman, editors, Proceedings of the ACM SIGPLAN International Workshop on State of the Art in Java Program analysis, SOAP 2012, Beijing, China, June 14, 2012, pages 3-8. ACM, 2012. URL: https://doi.org/10.1145/2259051.2259052.
  6. Marco Cova, Viktoria Felmetsger, Greg Banks, and Giovanni Vigna. Static detection of vulnerabilities in x86 executables. In 2006 22nd Annual Computer Security Applications Conference (ACSAC'06), pages 269-278. IEEE, 2006. Google Scholar
  7. David Darais, Nicholas Labich, Phuc C. Nguyen, and David Van Horn. Abstracting definitional interpreters (functional pearl). PACMPL, 1(ICFP):12:1-12:25, 2017. Google Scholar
  8. Julian Dolby, Stephen J Fink, and Manu Sridharan. Watson libraries for analysis (wala). URL: http://wala.sf.net/.
  9. Neville Grech and Yannis Smaragdakis. P/taint: unified points-to and taint analysis. Proc. ACM Program. Lang., 1(OOPSLA):102:1-102:28, 2017. URL: https://doi.org/10.1145/3133926.
  10. Andreas Haas, Andreas Rossberg, Derek L. Schuff, Ben L. Titzer, Michael Holman, Dan Gohman, Luke Wagner, Alon Zakai, and J. F. Bastien. Bringing the web up to speed with webassembly. In Albert Cohen and Martin T. Vechev, editors, Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017, pages 185-200. ACM, 2017. URL: https://doi.org/10.1145/3062341.3062363.
  11. Aaron Hilbig, Daniel Lehmann, and Michael Pradel. An empirical study of real-world webassembly binaries: Security, languages, use cases. In WWW: The Web Conference, pages 2696-2708. ACM / IW3C2, 2021. Google Scholar
  12. David Van Horn and Matthew Might. Abstracting abstract machines. In Proceeding of the 15th ACM SIGPLAN international conference on Functional programming, ICFP 2010, Baltimore, Maryland, USA, September 27-29, 2010, pages 51-62. ACM, 2010. Google Scholar
  13. Sven Keidel and Sebastian Erdweg. Sound and reusable components for abstract interpretation. Proc. ACM Program. Lang., 3(OOPSLA), October 2019. URL: https://doi.org/10.1145/3360602.
  14. Sven Keidel, Casper Bach Poulsen, and Sebastian Erdweg. Compositional soundness proofs of abstract interpreters. Proceedings of the ACM on Programming Languages, 2(ICFP):1-26, 2018. Google Scholar
  15. Johannes Kinder. Static analysis of x86 executables (Statische Analyse von Programmen in x86-Maschinensprache). PhD thesis, Darmstadt University of Technology, 2010. URL: http://tuprints.ulb.tu-darmstadt.de/2338/.
  16. Johannes Kinder and Helmut Veith. Jakstab: A static analysis platform for binaries. In Aarti Gupta and Sharad Malik, editors, Computer Aided Verification, 20th International Conference, CAV 2008, Princeton, NJ, USA, July 7-14, 2008, Proceedings, volume 5123 of Lecture Notes in Computer Science, pages 423-427. Springer, 2008. URL: https://doi.org/10.1007/978-3-540-70545-1_40.
  17. Johannes Kinder, Florian Zuleger, and Helmut Veith. An abstract interpretation-based framework for control flow reconstruction from binaries. In Neil D. Jones and Markus Müller-Olm, editors, Verification, Model Checking, and Abstract Interpretation, 10th International Conference, VMCAI 2009, Savannah, GA, USA, January 18-20, 2009. Proceedings, volume 5403 of Lecture Notes in Computer Science, pages 214-228. Springer, 2009. URL: https://doi.org/10.1007/978-3-540-93900-9_19.
  18. István Koren. A standalone webassembly development environment for the internet of things. In Marco Brambilla, Richard Chbeir, Flavius Frasincar, and Ioana Manolescu, editors, Web Engineering, pages 353-360, Cham, 2021. Springer International Publishing. Google Scholar
  19. Erik Krogh Kristensen and Anders Møller. Reasonably-most-general clients for javascript library analysis. In Proceedings of the 41st International Conference on Software Engineering, ICSE 2019, Montreal, QC, Canada, May 25-31, 2019, pages 83-93. IEEE / ACM, 2019. Google Scholar
  20. Daniel Lehmann, Johannes Kinder, and Michael Pradel. Everything old is new again: Binary security of webassembly. In 29th USENIX Security Symposium (USENIX Security 20), pages 217-234. USENIX Association, August 2020. URL: https://www.usenix.org/conference/usenixsecurity20/presentation/lehmann.
  21. Daniel Lehmann and Michael Pradel. Wasabi: A framework for dynamically analyzing webassembly. In Iris Bahar, Maurice Herlihy, Emmett Witchel, and Alvin R. Lebeck, editors, Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2019, Providence, RI, USA, April 13-17, 2019, pages 1045-1058. ACM, 2019. URL: https://doi.org/10.1145/3297858.3304068.
  22. Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification. Addison-Wesley, 1997. Google Scholar
  23. Marius Musch, Christian Wressnegger, Martin Johns, and Konrad Rieck. New kid on the web: A study on the prevalence of webassembly in the wild. In Roberto Perdisci, Clémentine Maurice, Giorgio Giacinto, and Magnus Almgren, editors, Detection of Intrusions and Malware, and Vulnerability Assessment - 16th International Conference, DIMVA 2019, Gothenburg, Sweden, June 19-20, 2019, Proceedings, volume 11543 of Lecture Notes in Computer Science, pages 23-42. Springer, 2019. URL: https://doi.org/10.1007/978-3-030-22038-9_2.
  24. Minh Hai Nguyen, Thien Binh Nguyen, Thanh Tho Quan, and Mizuhito Ogawa. A hybrid approach for control flow graph construction from binary code. In Pornsiri Muenchaisri and Gregg Rothermel, editors, 20th Asia-Pacific Software Engineering Conference, APSEC 2013, Ratchathewi, Bangkok, Thailand, December 2-5, 2013 - Volume 2, pages 159-164. IEEE Computer Society, 2013. URL: https://doi.org/10.1109/APSEC.2013.132.
  25. Yannis Smaragdakis and George Balatsouras. Pointer analysis. Found. Trends Program. Lang., 2(1):1-69, 2015. URL: https://doi.org/10.1561/2500000014.
  26. Johannes Späth, Karim Ali, and Eric Bodden. Ide^al: efficient and precise alias-aware dataflow analysis. Proc. ACM Program. Lang., 1(OOPSLA):99:1-99:27, 2017. URL: https://doi.org/10.1145/3133923.
  27. Fausto Spoto. The julia static analyzer for java. In Xavier Rival, editor, Static Analysis - 23rd International Symposium, SAS 2016, Edinburgh, UK, September 8-10, 2016, Proceedings, volume 9837 of Lecture Notes in Computer Science, pages 39-57. Springer, 2016. URL: https://doi.org/10.1007/978-3-662-53413-7_3.
  28. Quentin Stiévenart and Coen De Roover. Compositional information flow analysis for webassembly programs. In 20th IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2020, Adelaide, Australia, September 28 - October 2, 2020, pages 13-24. IEEE, 2020. URL: https://doi.org/10.1109/SCAM51674.2020.00007.
  29. Quentin Stiévenart, Coen De Roover, and Mohammad Ghafari. The security risk of lacking compiler protection in webassembly, 2021. URL: https://arxiv.org/abs/2111.01421.
  30. Tayssir Touili and Xin Ye. Reachability analysis of self modifying code. In 22nd International Conference on Engineering of Complex Computer Systems, ICECCS 2017, Fukuoka, Japan, November 5-8, 2017, pages 120-127. IEEE Computer Society, 2017. URL: https://doi.org/10.1109/ICECCS.2017.19.
  31. Raja Vallée-Rai, Phong Co, Etienne Gagnon, Laurie J. Hendren, Patrick Lam, and Vijay Sundaresan. Soot - a java bytecode optimization framework. In Stephen A. MacKay and J. Howard Johnson, editors, Proceedings of the 1999 conference of the Centre for Advanced Studies on Collaborative Research, November 8-11, 1999, Mississauga, Ontario, Canada, page 13. IBM, 1999. URL: https://dl.acm.org/citation.cfm?id=782008.
  32. Raja Vallee-rai and Laurie Hendren. Jimple: Simplifying java bytecode for analyses and transformations, 1998. Google Scholar
  33. Richard Wartell, Yan Zhou, Kevin W. Hamlen, Murat Kantarcioglu, and Bhavani M. Thuraisingham. Differentiating code from data in x86 binaries. In Dimitrios Gunopulos, Thomas Hofmann, Donato Malerba, and Michalis Vazirgiannis, editors, Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2011, Athens, Greece, September 5-9, 2011, Proceedings, Part III, volume 6913 of Lecture Notes in Computer Science, pages 522-536. Springer, 2011. URL: https://doi.org/10.1007/978-3-642-23808-6_34.
  34. Conrad Watt, Xiaojia Rao, Jean Pichon-Pharabod, Martin Bodin, and Philippa Gardner. Two Mechanisations of WebAssembly 1.0. In FM 2021 - Formal Methods, pages 1-19, Beijing, China, November 2021. URL: https://hal.archives-ouvertes.fr/hal-03353748.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail