DROPS

Document

DOI: 10.4230/LIPIcs.ECOOP.2023.37

Do Machine Learning Models Produce TypeScript Types That Type Check?

Authors: Ming-Ho Yee and Arjun Guha

Published in: LIPIcs, Volume 263, 37th European Conference on Object-Oriented Programming (ECOOP 2023)

Abstract

Type migration is the process of adding types to untyped code to gain assurance at compile time. TypeScript and other gradual type systems facilitate type migration by allowing programmers to start with imprecise types and gradually strengthen them. However, adding types is a manual effort and several migrations on large, industry codebases have been reported to have taken several years. In the research community, there has been significant interest in using machine learning to automate TypeScript type migration. Existing machine learning models report a high degree of accuracy in predicting individual TypeScript type annotations. However, in this paper we argue that accuracy can be misleading, and we should address a different question: can an automatic type migration tool produce code that passes the TypeScript type checker? We present TypeWeaver, a TypeScript type migration tool that can be used with an arbitrary type prediction model. We evaluate TypeWeaver with three models from the literature: DeepTyper, a recurrent neural network; LambdaNet, a graph neural network; and InCoder, a general-purpose, multi-language transformer that supports fill-in-the-middle tasks. Our tool automates several steps that are necessary for using a type prediction model, including (1) importing types for a project’s dependencies; (2) migrating JavaScript modules to TypeScript notation; (3) inserting predicted type annotations into the program to produce TypeScript when needed; and (4) rejecting non-type predictions when needed. We evaluate TypeWeaver on a dataset of 513 JavaScript packages, including packages that have never been typed before. With the best type prediction model, we find that only 21% of packages type check, but more encouragingly, 69% of files type check successfully.

Cite as

Ming-Ho Yee and Arjun Guha. Do Machine Learning Models Produce TypeScript Types That Type Check?. In 37th European Conference on Object-Oriented Programming (ECOOP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 263, pp. 37:1-37:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{yee_et_al:LIPIcs.ECOOP.2023.37,
  author =	{Yee, Ming-Ho and Guha, Arjun},
  title =	{{Do Machine Learning Models Produce TypeScript Types That Type Check?}},
  booktitle =	{37th European Conference on Object-Oriented Programming (ECOOP 2023)},
  pages =	{37:1--37:28},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-281-5},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{263},
  editor =	{Ali, Karim and Salvaneschi, Guido},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2023.37},
  URN =		{urn:nbn:de:0030-drops-182307},
  doi =		{10.4230/LIPIcs.ECOOP.2023.37},
  annote =	{Keywords: Type migration, deep learning}
}

Document

Artifact

DOI: 10.4230/DARTS.9.2.5

Do Machine Learning Models Produce TypeScript Types That Type Check? (Artifact)

Authors: Ming-Ho Yee and Arjun Guha

Published in: DARTS, Volume 9, Issue 2, Special Issue of the 37th European Conference on Object-Oriented Programming (ECOOP 2023)

Abstract

Type migration is the process of adding types to untyped code to gain assurance at compile time. TypeScript and other gradual type systems facilitate type migration by allowing programmers to start with imprecise types and gradually strengthen them. However, adding types is a manual effort and several migrations on large, industry codebases have been reported to have taken several years. In the research community, there has been significant interest in using machine learning to automate TypeScript type migration. Existing machine learning models report a high degree of accuracy in predicting individual TypeScript type annotations. However, in this paper we argue that accuracy can be misleading, and we should address a different question: can an automatic type migration tool produce code that passes the TypeScript type checker? We present TypeWeaver, a TypeScript type migration tool that can be used with an arbitrary type prediction model. We evaluate TypeWeaver with three models from the literature: DeepTyper, a recurrent neural network; LambdaNet, a graph neural network; and InCoder, a general-purpose, multi-language transformer that supports fill-in-the-middle tasks. Our tool automates several steps that are necessary for using a type prediction model, including (1) importing types for a project’s dependencies; (2) migrating JavaScript modules to TypeScript notation; (3) inserting predicted type annotations into the program to produce TypeScript when needed; and (4) rejecting non-type predictions when needed. We evaluate TypeWeaver on a dataset of 513 JavaScript packages, including packages that have never been typed before. With the best type prediction model, we find that only 21% of packages type check, but more encouragingly, 69% of files type check successfully.

Cite as

Ming-Ho Yee and Arjun Guha. Do Machine Learning Models Produce TypeScript Types That Type Check? (Artifact). In Special Issue of the 37th European Conference on Object-Oriented Programming (ECOOP 2023). Dagstuhl Artifacts Series (DARTS), Volume 9, Issue 2, pp. 5:1-5:3, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@Article{yee_et_al:DARTS.9.2.5,
  author =	{Yee, Ming-Ho and Guha, Arjun},
  title =	{{Do Machine Learning Models Produce TypeScript Types That Type Check? (Artifact)}},
  pages =	{5:1--5:3},
  journal =	{Dagstuhl Artifacts Series},
  ISSN =	{2509-8195},
  year =	{2023},
  volume =	{9},
  number =	{2},
  editor =	{Yee, Ming-Ho and Guha, Arjun},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DARTS.9.2.5},
  URN =		{urn:nbn:de:0030-drops-182456},
  doi =		{10.4230/DARTS.9.2.5},
  annote =	{Keywords: Type migration, deep learning}
}

Document

DOI: 10.4230/LIPIcs.SNAPL.2017.5

Fission: Secure Dynamic Code-Splitting for JavaScript

Authors: Arjun Guha, Jean-Baptiste Jeannin, Rachit Nigam, Jane Tangen, and Rian Shambaugh

Published in: LIPIcs, Volume 71, 2nd Summit on Advances in Programming Languages (SNAPL 2017)

Abstract

Traditional web programming involves the creation of two distinct programs: a client-side front-end, a server-side back-end, and a lot of communications boilerplate. An alternative approach is to use a tierless programming model, where a single program describes the behavior of both the client and the server, and the runtime system takes care of communication. Unfortunately, this usually entails adopting a new language and thus abandoning well-worn libraries and web programming tools. In this paper, we present our ongoing work on Fission, a platform that uses dynamic tier-splitting and dynamic information flow control to transparently run a single JavaScript program across the client and server. Although static tier-splitting has been studied before, our focus on dynamic approaches presents several new challenges and opportunities. For example, Fission supports characteristic JavaScript features such as eval and sophisticated JavaScript libraries like React. Therefore, programmers can reason about the integrity and confidentiality of information while continuing to use common libraries and programming patterns. Moreover, by unifying the client and server into a single program, Fission allows language-based tools, like type systems and IDEs, to manipulate complete web applications. To illustrate, we use TypeScript to ensure that client-server communication does not go wrong.

Cite as

Arjun Guha, Jean-Baptiste Jeannin, Rachit Nigam, Jane Tangen, and Rian Shambaugh. Fission: Secure Dynamic Code-Splitting for JavaScript. In 2nd Summit on Advances in Programming Languages (SNAPL 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 71, pp. 5:1-5:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{guha_et_al:LIPIcs.SNAPL.2017.5,
  author =	{Guha, Arjun and Jeannin, Jean-Baptiste and Nigam, Rachit and Tangen, Jane and Shambaugh, Rian},
  title =	{{Fission: Secure Dynamic Code-Splitting for JavaScript}},
  booktitle =	{2nd Summit on Advances in Programming Languages (SNAPL 2017)},
  pages =	{5:1--5:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-032-3},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{71},
  editor =	{Lerner, Benjamin S. and Bod{\'\i}k, Rastislav and Krishnamurthi, Shriram},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SNAPL.2017.5},
  URN =		{urn:nbn:de:0030-drops-71247},
  doi =		{10.4230/LIPIcs.SNAPL.2017.5},
  annote =	{Keywords: JavaScript, information flow control}
}

3 Search Results for "Guha, Arjun"

Do Machine Learning Models Produce TypeScript Types That Type Check?

Abstract

Cite as

Do Machine Learning Models Produce TypeScript Types That Type Check? (Artifact)

Abstract

Cite as

Fission: Secure Dynamic Code-Splitting for JavaScript

Abstract

Cite as

Thanks for your feedback!

Could not send message