A Verified LL(1) Parser Generator

Authors Sam Lasser, Chris Casinghino, Kathleen Fisher, Cody Roux



PDF
Thumbnail PDF

File

LIPIcs.ITP.2019.24.pdf
  • Filesize: 0.49 MB
  • 18 pages

Document Identifiers

Author Details

Sam Lasser
  • Tufts University, Medford, MA, USA
Chris Casinghino
  • Draper, Cambridge, MA, USA
Kathleen Fisher
  • Tufts University, Medford, MA, USA
Cody Roux
  • Draper, Cambridge, MA, USA

Acknowledgements

We thank our anonymous reviewers for their helpful feedback.

Cite AsGet BibTex

Sam Lasser, Chris Casinghino, Kathleen Fisher, and Cody Roux. A Verified LL(1) Parser Generator. In 10th International Conference on Interactive Theorem Proving (ITP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 141, pp. 24:1-24:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.ITP.2019.24

Abstract

An LL(1) parser is a recursive descent algorithm that uses a single token of lookahead to build a grammatical derivation for an input sequence. We present an LL(1) parser generator that, when applied to grammar G, produces an LL(1) parser for G if such a parser exists. We use the Coq Proof Assistant to verify that the generator and the parsers that it produces are sound and complete, and that they terminate on all inputs without using fuel parameters. As a case study, we extract the tool’s source code and use it to generate a JSON parser. The generated parser runs in linear time; it is two to four times slower than an unverified parser for the same grammar.

Subject Classification

ACM Subject Classification
  • Theory of computation → Grammars and context-free languages
  • Software and its engineering → Parsers
  • Software and its engineering → Formal software verification
Keywords
  • interactive theorem proving
  • top-down parsing

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Andrew W. Appel. Modern Compiler Implementation in ML. Cambridge University Press, 1998. Google Scholar
  2. Aditi Barthwal and Michael Norrish. Verified, executable parsing. In European Symposium on Programming, pages 160-174. Springer, 2009. Google Scholar
  3. Sandrine Blazy and Xavier Leroy. Mechanized semantics for the Clight subset of the C language. Journal of Automated Reasoning, 43(3):263-288, 2009. Google Scholar
  4. Bryan Ford. Parsing Expression Grammars: A Recognition-based Syntactic Foundation. In Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '04, pages 111-122, New York, NY, USA, 2004. ACM. URL: https://doi.org/10.1145/964001.964011.
  5. Dan Goodin. Failure to patch two-month-old bug led to massive Equifax breach. https://arstechnica.com/information-technology/2017/09/massive-equifax-breach-caused-by-failure-to-patch-two-month-old-bug/, 2017.
  6. cloudflare: Cloudflare Reverse Proxies are Dumping Uninitialized Memory. https://bugs.chromium.org/p/project-zero/issues/detail?id=1139, 2017.
  7. Dick Grune and Ceriel JH Jacobs. Parsing Techniques (Monographs in Computer Science). Springer-Verlag, 2006. Google Scholar
  8. Jacques-Henri Jourdan, François Pottier, and Xavier Leroy. Validating LR(1) parsers. In European Symposium on Programming, pages 397-416. Springer, 2012. Google Scholar
  9. Adam Koprowski and Henri Binsztok. TRX: A formally verified parser interpreter. In European Symposium on Programming, pages 345-365. Springer, 2010. Google Scholar
  10. Pierre Letouzey. Extraction in Coq: An overview. In Conference on Computability in Europe, pages 359-369. Springer, 2008. Google Scholar
  11. Robin Milner. A theory of type polymorphism in programming. Journal of computer and system sciences, 17(3):348-375, 1978. Google Scholar
  12. Yaron Minsky, Anil Madhavapeddy, and Jason Hickey. Real World OCaml: Functional programming for the masses. O'Reilly Media, Inc., 2013. Google Scholar
  13. CVE-2016-0101. National Vulnerability Database. https://nvd.nist.gov/vuln/detail/CVE-2016-0101, 2016.
  14. CVE-2017-5638. National Vulnerability Database. https://nvd.nist.gov/vuln/detail/CVE-2017-5638, 2017.
  15. Terence Parr and Kathleen Fisher. LL(*): The foundation of the ANTLR parser generator. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 425-436, June 2011. Google Scholar
  16. Terence Parr, Sam Harwell, and Kathleen Fisher. Adaptive LL(*) Parsing: The Power of Dynamic Analysis. In Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, volume 49, pages 579-598, October 2014. URL: https://doi.org/10.1145/2714064.2660202.
  17. François Pottier and Yann Régis-Gianas. Menhir reference manual. Inria, August 2016. Google Scholar
  18. Matthieu Sozeau. PROGRAM-ing finger trees in Coq. In ACM SIGPLAN International Conference on Functional Programming. Association for Computing Machinery, 2007. Google Scholar
  19. The Coq Proof Assistant, version 8.9.0, January 2019. URL: https://doi.org/10.5281/zenodo.2554024.
  20. Ryan Wisnesky, Gregory Michael Malecha, and John Gregory Morrisett. Certified web services in Ynot, 2010. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail