Do Machine Learning Models Produce TypeScript Types That Type Check?

Authors: Ming-Ho Yee, Arjun Guha



Author Details

Ming-Ho Yee
  • Northeastern University, Boston, MA, USA
Arjun Guha
  • Northeastern University, Boston, MA, USA
  • Roblox Research, San Mateo, CA, USA

Acknowledgements

We thank Northeastern Research Computing and the New England Research Cloud for providing computing resources; and Leif Andersen, Luna Phipps-Costin, Donald Pinckney, and the anonymous reviewers for their feedback.

Cite As

Ming-Ho Yee and Arjun Guha. Do Machine Learning Models Produce TypeScript Types That Type Check? In 37th European Conference on Object-Oriented Programming (ECOOP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 263, pp. 37:1-37:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.ECOOP.2023.37

Abstract

Type migration is the process of adding types to untyped code to gain assurance at compile time. TypeScript and other gradual type systems facilitate type migration by allowing programmers to start with imprecise types and gradually strengthen them. However, adding types is a manual effort and several migrations on large, industry codebases have been reported to have taken several years. In the research community, there has been significant interest in using machine learning to automate TypeScript type migration. Existing machine learning models report a high degree of accuracy in predicting individual TypeScript type annotations. However, in this paper we argue that accuracy can be misleading, and we should address a different question: can an automatic type migration tool produce code that passes the TypeScript type checker? We present TypeWeaver, a TypeScript type migration tool that can be used with an arbitrary type prediction model. We evaluate TypeWeaver with three models from the literature: DeepTyper, a recurrent neural network; LambdaNet, a graph neural network; and InCoder, a general-purpose, multi-language transformer that supports fill-in-the-middle tasks. Our tool automates several steps that are necessary for using a type prediction model, including (1) importing types for a project’s dependencies; (2) migrating JavaScript modules to TypeScript notation; (3) inserting predicted type annotations into the program to produce TypeScript when needed; and (4) rejecting non-type predictions when needed. We evaluate TypeWeaver on a dataset of 513 JavaScript packages, including packages that have never been typed before. With the best type prediction model, we find that only 21% of packages type check, but more encouragingly, 69% of files type check successfully.
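
The gap between the package-level and file-level figures can be illustrated with a small sketch. It assumes, as a reading of the abstract rather than a statement of the paper's exact metric, that a package counts as type checking only when every one of its files does. The `PackageResult` type, both functions, and the sample data are hypothetical and are not taken from TypeWeaver or its dataset.

```typescript
// Hypothetical record: one boolean per file, true if that file type checks.
type PackageResult = { name: string; files: boolean[] };

// Fraction of packages in which every file type checks (assumed criterion).
function packageRate(packages: PackageResult[]): number {
  if (packages.length === 0) return 0;
  const ok = packages.filter((p) => p.files.every((f) => f)).length;
  return ok / packages.length;
}

// Fraction of individual files that type check, pooled across all packages.
function fileRate(packages: PackageResult[]): number {
  let total = 0;
  let passing = 0;
  for (const p of packages) {
    for (const f of p.files) {
      total += 1;
      if (f) passing += 1;
    }
  }
  return total === 0 ? 0 : passing / total;
}

// Made-up sample: three of the four packages each contain one failing file.
const sample: PackageResult[] = [
  { name: "pkg-a", files: [true, true] },
  { name: "pkg-b", files: [true, false, true] },
  { name: "pkg-c", files: [false, true, true] },
  { name: "pkg-d", files: [true, true, false] },
];

const samplePackageRate = packageRate(sample); // 1 of 4 packages
const sampleFileRate = fileRate(sample); // 8 of 11 files
```

In this toy data a single failing file is enough to fail its whole package, so the package rate is much lower than the file rate. Under the stated assumption, the same effect would explain how most files can type check while most packages do not.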

Subject Classification

ACM Subject Classification
  • Software and its engineering → Source code generation
  • General and reference → Evaluation
  • Theory of computation → Type structures
Keywords
  • Type migration
  • deep learning
