Static Type Analysis by Abstract Interpretation of Python Programs

Authors Raphaël Monat , Abdelraouf Ouadjaout , Antoine Miné



PDF
Thumbnail PDF

File

LIPIcs.ECOOP.2020.17.pdf
  • Filesize: 0.76 MB
  • 29 pages

Document Identifiers

Author Details

Raphaël Monat
  • Sorbonne Université, CNRS, LIP6, Paris, France
Abdelraouf Ouadjaout
  • Sorbonne Université, CNRS, LIP6, Paris, France
Antoine Miné
  • Sorbonne Université, CNRS, LIP6, Paris, France
  • Institut Universitaire de France, Paris, France

Acknowledgements

We thank the anonymous reviewers for their valuable comments and feedback.

Cite As Get BibTex

Raphaël Monat, Abdelraouf Ouadjaout, and Antoine Miné. Static Type Analysis by Abstract Interpretation of Python Programs. In 34th European Conference on Object-Oriented Programming (ECOOP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 166, pp. 17:1-17:29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.ECOOP.2020.17

Abstract

Python is an increasingly popular dynamic programming language, particularly used in the scientific community and well-known for its powerful and permissive high-level syntax. Our work aims at detecting statically and automatically type errors. As these type errors are exceptions that can be caught later on, we precisely track all exceptions (raised or caught). We designed a static analysis by abstract interpretation able to infer the possible types of variables, taking into account the full control-flow. It handles both typing paradigms used in Python, nominal and structural, supports Python’s object model, introspection operators allowing dynamic type testing, dynamic attribute addition, as well as exception handling. We present a flow- and context-sensitive analysis with special domains to support containers (such as lists) and infer type equalities (allowing it to express parametric polymorphism). The analysis is soundly derived by abstract interpretation from a concrete semantics of Python developed by Fromherz et al. Our analysis is designed in a modular way as a set of domains abstracting a concrete collecting semantics. It has been implemented into the MOPSA analysis framework, and leverages external type annotations from the Typeshed project to support the vast standard library. We show that it scales to benchmarks a few thousand lines long, and preliminary results show it is able to analyze a small real-life command-line utility called PathPicker. Compared to previous work, it is sound, while it keeps similar efficiency and precision.

Subject Classification

ACM Subject Classification
  • Theory of computation → Program analysis
  • Software and its engineering → Semantics
Keywords
  • Formal Methods
  • Static Analysis
  • Abstract Interpretation
  • Type Analysis
  • Dynamic Programming Language
  • Python Semantics

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Roberto Amadini, Alexander Jordan, Graeme Gange, François Gauthier, Peter Schachte, Harald Søndergaard, Peter J. Stuckey, and Chenyi Zhang. Combining string abstract domains for JavaScript analysis: An evaluation. In TACAS (1), volume 10205 of LNCS, pages 41-57, 2017. Google Scholar
  2. Davide Ancona, Massimo Ancona, Antonio Cuni, and Nicholas D. Matsakis. Rpython: a step towards reconciling dynamically and statically typed OO languages. In DLS, pages 53-64. ACM, 2007. Google Scholar
  3. Gogul Balakrishnan and Thomas W. Reps. Recency-abstraction for heap-allocated storage. In SAS, volume 4134 of LNCS, pages 221-239. Springer, 2006. Google Scholar
  4. Kim Barrett, Bob Cassels, Paul Haahr, David A. Moon, Keith Playford, and P. Tucker Withington. A monotonic superclass linearization for dylan. In OOPSLA, pages 69-82. ACM, 1996. Google Scholar
  5. Julien Bertrane, Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, and Xavier Rival. Static analysis and verification of aerospace software by abstract interpretation. Foundations and Trends in Programming Languages, 2(2-3):71-190, 2015. Google Scholar
  6. Bruno Blanchet, Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, David Monniaux, and Xavier Rival. Design and implementation of a special-purpose static program analyzer for safety-critical real-time embedded software. In The Essence of Computation, volume 2566 of LNCS, pages 85-108. Springer, 2002. Google Scholar
  7. Martin Bodin, Arthur Charguéraud, Daniele Filaretti, Philippa Gardner, Sergio Maffeis, Daiva Naudziuniene, Alan Schmitt, and Gareth Smith. A trusted mechanised JavaScript specification. In POPL, pages 87-100. ACM, 2014. Google Scholar
  8. Patrick Cousot. Types as abstract interpretations. In POPL, pages 316-331. ACM Press, 1997. Google Scholar
  9. Patrick Cousot and Radhia Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In POPL, pages 238-252. ACM, 1977. Google Scholar
  10. Pascal Cuoq, Florent Kirchner, Nikolai Kosmatov, Virgile Prevosto, Julien Signoles, and Boris Yakobowski. Frama-C - A software analysis perspective. In SEFM, volume 7504 of LNCS, pages 233-247. Springer, 2012. Google Scholar
  11. David Delmas, Eric Goubault, Sylvie Putot, Jean Souyris, Karim Tekkal, and Franck Védrine. Towards an industrial use of FLUCTUAT on safety-critical avionics software. In FMICS, volume 5825 of LNCS, pages 53-69. Springer, 2009. Google Scholar
  12. Levin Fritz and Jurriaan Hage. Cost versus precision for approximate typing for Python. In PEPM, pages 89-98. ACM, 2017. Google Scholar
  13. Aymeric Fromherz, Abdelraouf Ouadjaout, and Antoine Miné. Static value analysis of Python programs by abstract interpretation. In NFM, volume 10811 of LNCS, pages 185-202. Springer, 2018. Google Scholar
  14. Ronald Garcia, Alison M. Clark, and Éric Tanter. Abstracting gradual typing. In POPL, pages 429-442. ACM, 2016. Google Scholar
  15. Dwight Guth. A formal semantics of Python 3.3. Technical report, University of Illinois, 2013. URL: https://www.ideals.illinois.edu/bitstream/handle/2142/45275/Dwight_Guth.pdf?sequence=1&isAllowed=y.
  16. Mostafa Hassan, Caterina Urban, Marco Eilers, and Peter Müller. MaxSMT-based type inference for Python 3. In CAV (2), volume 10982 of LNCS, pages 12-19. Springer, 2018. Google Scholar
  17. Simon Holm Jensen, Peter A. Jonsson, and Anders Møller. Remedying the eval that men do. In ISSTA, pages 34-44. ACM, 2012. Google Scholar
  18. Simon Holm Jensen, Anders Møller, and Peter Thiemann. Type analysis for JavaScript. In SAS, volume 5673 of LNCS, pages 238-255. Springer, 2009. Google Scholar
  19. Simon Holm Jensen, Anders Møller, and Peter Thiemann. Interprocedural analysis with lazy propagation. In SAS, volume 6337 of LNCS, pages 320-339. Springer, 2010. Google Scholar
  20. M. Journault, A. Miné, R. Monat, and A. Ouadjaout. Combinations of reusable abstract domains for a multilingual static analyzer. In Proc. of the 11th Working Conference on Verified Software: Theories, Tools, and Experiments (VSTTE19), pages 1-17, July 2019. Google Scholar
  21. Francesco Logozzo and Herman Venter. RATA: Rapid atomic type analysis by abstract interpretation - application to JavaScript optimization. In CC, volume 6011 of LNCS, pages 66-83. Springer, 2010. Google Scholar
  22. Magnus Madsen and Esben Andreasen. String analysis for dynamic field access. In CC, volume 8409 of LNCS, pages 197-217. Springer, 2014. Google Scholar
  23. Daniel Marino and Todd D. Millstein. A generic type-and-effect system. In TLDI, pages 39-50. ACM, 2009. Google Scholar
  24. A. Miné, A. Ouadjaout, and M. Journault. Design of a modular platform for static analysis. In Proc. of 9h Workshop on Tools for Automatic Program Analysis (TAPAS'18), LNCS, page 4, 28 August 2018. Google Scholar
  25. Hakjoo Oh, Kihong Heo, Wonchan Lee, Woosuk Lee, and Kwangkeun Yi. Design and implementation of sparse global analyses for C-like languages. In PLDI, pages 229-238. ACM, 2012. Google Scholar
  26. Daejun Park, Andrei Stefanescu, and Grigore Rosu. KJS: A complete formal semantics of JavaScript. In PLDI, pages 346-356. ACM, 2015. Google Scholar
  27. Joe Gibbs Politz, Alejandro Martinez, Matthew Milano, Sumner Warren, Daniel Patterson, Junsong Li, Anand Chitipothu, and Shriram Krishnamurthi. Python: The full monty. In OOPSLA, pages 217-232. ACM, 2013. Google Scholar
  28. Ranson, Hamilton, and Fong. A semantics of Python in Isabelle/HOL. Technical report, University of Regina, 2008. URL: http://www.cs.uregina.ca/Research/Techreports/2008-04.pdf.
  29. Grigore Rosu and Traian-Florin Serbanuta. An overview of the K semantic framework. J. Log. Algebr. Program., 79(6):397-434, 2010. Google Scholar
  30. Jeremy G. Siek and Walid Taha. Gradual typing for objects. In ECOOP, volume 4609 of LNCS, pages 2-27. Springer, 2007. Google Scholar
  31. Gideon Joachim Smeding. An executable operational semantics for Python. Universiteit Utrecht, 2009. URL: http://gideon.smdng.nl/wp-content/uploads/thesis.pdf.
  32. Thodoris Sotiropoulos and Benjamin Livshits. Static analysis for asynchronous JavaScript programs. In ECOOP, volume 134 of LIPIcs, pages 8:1-8:30. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2019. Google Scholar
  33. Fausto Spoto. The julia static analyzer for Java. In SAS, volume 9837 of LNCS, pages 39-57. Springer, 2016. Google Scholar
  34. Zhaogui Xu, Peng Liu, Xiangyu Zhang, and Baowen Xu. Python predictive analysis for bug detection. In SIGSOFT FSE, pages 121-132. ACM, 2016. Google Scholar
  35. Mypy. http://mypy-lang.org/, 2018. Accessed: 2018-07-22.
  36. Python enhancement proposal 484, about type hints. https://www.python.org/dev/peps/pep-0484/, 2018. Accessed: 2018-07-23.
  37. Typeshed. https://github.com/python/typeshed/, 2018. Accessed: 2018-07-22.
  38. Monkeytype. https://github.com/Instagram/MonkeyType, 2019. Accessed: 2019-10-22.
  39. Performance benchmarks from Python’s reference interpreter. https://github.com/python/pyperformance/, 2019. Accessed: 2019-10-22.
  40. Pyannotate. https://github.com/dropbox/pyannotate, 2019. Accessed: 2019-10-22.
  41. Pyre-check. https://github.com/facebook/pyre-check, 2019. Accessed: 2019-10-22.
  42. Pytype. https://github.com/google/pytype, 2019. Accessed: 2019-10-22.
  43. Pathpicker. https://github.com/facebook/pathpicker/, 2020. Accessed: 2020-01-03.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail