A CFL-Reachability Formulation of Callsite-Sensitive Pointer Analysis with Built-In On-The-Fly Call Graph Construction

Authors: Dongjie He, Jingbo Lu, Jingling Xue



Author Details

Dongjie He
  • University of New South Wales, Sydney, Australia
  • Chongqing University, China
Jingbo Lu
  • University of New South Wales, Sydney, Australia
  • Shanghai Sectrend Information Technology Co., Ltd, China
Jingling Xue
  • University of New South Wales, Sydney, Australia

Acknowledgements

We thank the anonymous reviewers for their constructive comments.

Cite As

Dongjie He, Jingbo Lu, and Jingling Xue. A CFL-Reachability Formulation of Callsite-Sensitive Pointer Analysis with Built-In On-The-Fly Call Graph Construction. In 38th European Conference on Object-Oriented Programming (ECOOP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 313, pp. 18:1-18:29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ECOOP.2024.18

Abstract

In object-oriented languages, the traditional CFL-reachability formulation for k-callsite-sensitive pointer analysis (kCFA) focuses on modeling field accesses and calling contexts, but it relies on a separate algorithm for call graph construction. This division can result in a loss of precision in kCFA, a problem that persists even when using the most precise call graphs, whether pre-constructed or generated on the fly. Moreover, pre-analyses based on this framework aiming to improve the efficiency of kCFA may inadvertently reduce its precision, due to the framework’s lack of native call graph construction, essential for precise analysis. Addressing this gap, this paper introduces a novel CFL-reachability formulation of kCFA for Java, uniquely integrating on-the-fly call graph construction. This advancement not only addresses the precision loss inherent in the traditional CFL-reachability-based approach but also enhances its overall applicability. In a significant secondary contribution, we present the first precision-preserving pre-analysis to accelerate kCFA. This pre-analysis leverages selective context sensitivity to improve the efficiency of kCFA without sacrificing its precision. Collectively, these contributions represent a substantial step forward in pointer analysis, offering both theoretical and practical advancements that could benefit future developments in the field.
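To make the term "on-the-fly call graph construction" concrete, the following is a minimal sketch, not the paper's CFL-reachability formulation. It assumes a context-insensitive, Andersen-style analysis over a hypothetical toy program, and every name in it (BODIES, ALLOC_TYPE, analyze, the variables and allocation sites) is invented for illustration. The point it shows is that virtual calls are resolved from the receiver's points-to set while the analysis runs, so call edges and points-to facts are discovered together.

```python
from collections import defaultdict

# Hypothetical toy program (all names are illustrative assumptions).
# Allocation site -> dynamic class of the allocated object.
ALLOC_TYPE = {"o1": "A", "o2": "B"}

# Method -> list of statements. Each method "C.m" uses the variables
# "C.m.this" (receiver) and "C.m.ret" (return value).
BODIES = {
    "main": [
        ("new", "a", "o1"),        # a = new A()   at site o1
        ("new", "b", "o2"),        # b = new B()   at site o2
        ("call", "r", "a", "id"),  # r = a.id()    virtual call
    ],
    "A.id": [("assign", "A.id.ret", "A.id.this")],  # return this
    "B.id": [("assign", "B.id.ret", "B.id.this")],  # return this
}


def add(target, new):
    """Union `new` into `target`; report whether `target` grew."""
    before = len(target)
    target |= new
    return len(target) != before


def analyze(entry="main"):
    pts = defaultdict(set)   # variable -> set of allocation sites
    call_edges = set()       # (caller, callee) pairs found so far
    reachable = {entry}      # methods whose bodies are analyzed
    changed = True
    while changed:           # naive fixpoint iteration
        changed = False
        for method in sorted(reachable):
            for stmt in BODIES[method]:
                if stmt[0] == "new":
                    _, var, site = stmt
                    changed |= add(pts[var], {site})
                elif stmt[0] == "assign":
                    _, dst, src = stmt
                    changed |= add(pts[dst], pts[src])
                elif stmt[0] == "call":
                    _, dst, recv, name = stmt
                    # Dispatch on each object the receiver may point to.
                    for site in sorted(pts[recv]):
                        callee = f"{ALLOC_TYPE[site]}.{name}"
                        changed |= add(call_edges, {(method, callee)})
                        changed |= add(reachable, {callee})
                        changed |= add(pts[f"{callee}.this"], {site})
                        changed |= add(pts[dst], pts[f"{callee}.ret"])
    return pts, call_edges, reachable


pts, call_edges, reachable = analyze()
print(sorted(call_edges), sorted(reachable), sorted(pts["r"]))
```

On this toy input the only call edge found is from main to A.id, so B.id is never analyzed and r points only to o1. A call graph built beforehand from declared types alone could also include B.id as a target if B were a subtype of A's declared type, which is the kind of imprecision that coupling the two analyses avoids.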

Subject Classification

ACM Subject Classification
  • Theory of computation → Program analysis
Keywords
  • Pointer Analysis
  • CFL Reachability
  • Call Graph Construction
