Don't Panic! Better, Fewer, Syntax Errors for LR Parsers

Authors Lukas Diekmann, Laurence Tratt



PDF
Thumbnail PDF

File

LIPIcs.ECOOP.2020.6.pdf
  • Filesize: 1.06 MB
  • 32 pages

Document Identifiers

Author Details

Lukas Diekmann
  • Software Development Team, King’s College London, United Kingdom
Laurence Tratt
  • Software Development Team, King’s College London, United Kingdom

Acknowledgements

We are grateful to the Blackbox developers for allowing us access to their data, and particularly to Neil Brown for help in extracting a relevant corpus. We thank Edd Barrett for helping to set up our benchmarking machine and for comments on the paper. We also thank Carl Friedrich Bolz-Tereick, Sérgio Queiroz de Medeiros, Sarah Mount, François Pottier, Christian Urban, and Naveneetha Vasudevan for comments.

Cite AsGet BibTex

Lukas Diekmann and Laurence Tratt. Don't Panic! Better, Fewer, Syntax Errors for LR Parsers. In 34th European Conference on Object-Oriented Programming (ECOOP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 166, pp. 6:1-6:32, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.ECOOP.2020.6

Abstract

Syntax errors are generally easy to fix for humans, but not for parsers in general nor LR parsers in particular. Traditional "panic mode" error recovery, though easy to implement and applicable to any grammar, often leads to a cascading chain of errors that drown out the original. More advanced error recovery techniques suffer less from this problem but have seen little practical use because their typical performance was seen as poor, their worst case unbounded, and the repairs they reported arbitrary. In this paper we introduce the CPCT+ algorithm, and an implementation of that algorithm, that address these issues. First, CPCT+ reports the complete set of minimum cost repair sequences for a given location, allowing programmers to select the one that best fits their intention. Second, on a corpus of 200,000 real-world syntactically invalid Java programs, CPCT+ is able to repair 98.37%±0.017% of files within a timeout of 0.5s. Finally, CPCT+ uses the complete set of minimum cost repair sequences to reduce the cascading error problem, where incorrect error recovery causes further spurious syntax errors to be identified. Across the test corpus, CPCT+ reports 435,812±473 error locations to the user, reducing the cascading error problem substantially relative to the 981,628±0 error locations reported by panic mode.

Subject Classification

ACM Subject Classification
  • Theory of computation → Parsing
  • Software and its engineering → Compilers
Keywords
  • Parsing
  • error recovery
  • programming languages

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Alfred Aho and Thomas G. Peterson. A minimum distance error-correcting parser for context-free languages. J. Comput., 1:305-312, December 1972. Google Scholar
  2. Eberhard Bertsch and Mark-Jan Nederhof. On failure of the pruning technique in "error repair in shift-reduce parsers". TOPLAS, 21(1):1-10, January 1999. Google Scholar
  3. Neil C. C. Brown, Michael Kölling, Davin McCall, and Ian Utting. Blackbox: A large scale repository of novice programmers activity. In SIGCSE, March 2014. Google Scholar
  4. Carl Cerecke. Locally least-cost error repair in LR parsers. PhD thesis, University of Canterbury, June 2003. Google Scholar
  5. Rafael Corchuelo, José Antonio Pérez, Antonio Ruiz-Cortés, and Miguel Toro. Repairing syntax errors in LR parsers. TOPLAS, 24:698-710, November 2002. Google Scholar
  6. Maartje de Jonge, Lennart C. L. Kats, Eelco Visser, and Emma Söderberg. Natural and flexible error recovery for generated modular language environments. TOPLAS, 34(4):15:1-15:50, December 2012. Google Scholar
  7. Maartje de Jonge and Eelco Visser. Automated evaluation of syntax error recovery. In ASE, pages 322-325, September 2012. Google Scholar
  8. Pierpaolo Degano and Corrado Priami. Comparison of syntactic error handling in LR parsers. SPE, 25(6):657-679, June 1995. Google Scholar
  9. Joel E. Denny and Brian A. Malloy. The IELR(1) algorithm for generating minimal LR(1) parser tables for non-LR(1) grammars with conflict resolution. SCICO, 75(11):943-979, November 2010. Google Scholar
  10. Bradley Efron. Bootstrap methods: Another look at the jackknife. Ann. Statist., 7(1):1-26, January 1979. Google Scholar
  11. C. N. Fischer, B. A. Dion, and J. Mauney. A locally least-cost LR-error corrector. Technical Report 363, University of Wisconsin, August 1979. Google Scholar
  12. Carlos Gómez-Rodríguez, Miguel A. Alonso, and Manuel Vilares. Error-repair parsing schemata. TCS, 411(7):1121-1139, February 2010. Google Scholar
  13. Allen I. Holub. Compiler Design in C. Prentice-Hall, 1990. Google Scholar
  14. S. C. Johnson. YACC: Yet Another Compiler-Compiler. Technical Report Comp. Sci. 32, Bell Labs, July 1975. Google Scholar
  15. Ik-Soon Kim and Kwangkeun Yi. LR error repair using the A* algorithm. Acta Inf., 47:179-207, May 2010. Google Scholar
  16. Donald Knuth. On the translation of languages from left to right. Information and Control, 8(6):607-639, December 1965. Google Scholar
  17. Bruce J. McKenzie, Corey Yeatman, and Lorraine de Vere. Error repair in shift-reduce parsers. TOPLAS, 17(4):672-689, July 1995. Google Scholar
  18. Sérgio Medeiros, Gilney de Azevedo Alvez Junior, and Fabio Mascarenhas. Syntax error recovery in parsing expression grammars. CoRR, May 2019. URL: http://arxiv.org/abs/1905.02145.
  19. Sérgio Medeiros and Fabio Mascarenhas. Syntax error recovery in parsing expression grammars. In SAC, pages 1195-1202, April 2018. Google Scholar
  20. Marie-Hélène Nienaltowski, Michela Pedroni, and Bertrand Meyer. Compiler error messages: What can help novices? In SIGCSE, pages 168-172, March 2008. Google Scholar
  21. David Pager. A practical general method for constructing LR(k) parsers. Acta Inf., 7(3):249-268, September 1977. Google Scholar
  22. Terence Parr. The definitive ANTLR 4 reference. Pragmatic Bookshelf, January 2013. Google Scholar
  23. François Pottier. Reachability and error diagnosis in LR(1) parsers. In CC, pages 88-98, March 2016. Google Scholar
  24. James Prather, Raymond Pettit, Kayla Holcomb McMurry, Alani Peters, John Homer, Nevan Simone, and Maxine Cohen. On novices' interaction with compiler error messages: a human factors approach. In ICER, pages 74-82, August 2017. Google Scholar
  25. Sanguthevar Rajasekaran and Marius Nicolae. An error correcting parser for context free grammars that takes less than cubic time. In LATA, February 2016. Google Scholar
  26. Helmut Richter. Noncorrecting syntax error recovery. TOPLAS, 7(3):478-489, July 1985. Google Scholar
  27. Eddie Antonio Santos, Joshua Charles Campbell, Dhvani Patel, Abram Hindle, and José Nelson Amaral. Syntax and sensibility: Using language models to detect and correct syntax errors. In SANER, pages 1-11, March 2018. Google Scholar
  28. Masaru Tomita. An efficient augmented-context-free parsing algorithm. Comput. Linguist., 13(1-2):31-46, January 1987. Google Scholar
  29. D. A. Turner. Error diagnosis and recovery in one pass compilers. IPL, 6(4):113-115, August 1977. Google Scholar
  30. Arthur van Deudekom, Dick Grune, and Peter Kooiman. Initial experience with noncorrecting syntax error handling. Technical Report IR-339, Vrije Universiteit, Amsterdam, November 1993. Google Scholar
  31. Tim A. Wagner. Practical Algorithms for Incremental Software Development Environments. PhD thesis, University of California, Berkeley, March 1998. Google Scholar
  32. Vadim Zaytsev. Formal foundations for semi-parsing. In SMR, pages 313-317, February 2014. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail