Perfect Is the Enemy of Good: Best-Effort Program Synthesis

Authors Hila Peleg , Nadia Polikarpova



PDF
Thumbnail PDF

File

LIPIcs.ECOOP.2020.2.pdf
  • Filesize: 0.71 MB
  • 30 pages

Document Identifiers

Author Details

Hila Peleg
  • University of California San Diego, CA, USA
Nadia Polikarpova
  • University of California San Diego, CA, USA

Cite As Get BibTex

Hila Peleg and Nadia Polikarpova. Perfect Is the Enemy of Good: Best-Effort Program Synthesis. In 34th European Conference on Object-Oriented Programming (ECOOP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 166, pp. 2:1-2:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.ECOOP.2020.2

Abstract

Program synthesis promises to help software developers with everyday tasks by generating code snippets automatically from input-output examples and other high-level specifications. The conventional wisdom is that a synthesizer must always satisfy the specification exactly. We conjecture that this all-or-nothing paradigm stands in the way of adopting program synthesis as a developer tool: in practice, the user-written specification often contains errors or is simply too hard for the synthesizer to solve within a reasonable time; in these cases, the user is left with a single over-fitted result or, more often than not, no result at all. In this paper we propose a new program synthesis paradigm we call best-effort program synthesis, where the synthesizer returns a ranked list of partially-valid results, i.e. programs that satisfy some part of the specification.
To support this paradigm, we develop best-effort enumeration, a new synthesis algorithm that extends a popular program enumeration technique with the ability to accumulate and return multiple partially-valid results with minimal overhead. We implement this algorithm in a tool called BESTER, and evaluate it on 79 synthesis benchmarks from the literature. Contrary to the conventional wisdom, our evaluation shows that BESTER returns useful results even when the specification is flawed or too hard: i) for all benchmarks with an error in the specification, the top three BESTER results contain the correct solution, and ii) for most hard benchmarks, the top three results contain non-trivial fragments of the correct solution. We also performed an exploratory user study, which confirms our intuition that partially-valid results are useful: the study shows that programmers use the output of the synthesizer for comprehension and often incorporate it into their solutions.

Subject Classification

ACM Subject Classification
  • Theory of computation → Program specifications
  • Software and its engineering → Automatic programming
Keywords
  • Program Synthesis
  • Programming by Example

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Raihan Al-Ekram, Archana Adma, and Olga Baysal. diffx: an algorithm to detect changes in multi-version xml documents. In Proceedings of the 2005 conference of the Centre for Advanced Studies on Collaborative research, pages 1-11. IBM Press, 2005. Google Scholar
  2. Aws Albarghouthi, Sumit Gulwani, and Zachary Kincaid. Recursive program synthesis. In International Conference on Computer Aided Verification, pages 934-950. Springer, 2013. Google Scholar
  3. Rajeev Alur, Rastislav Bodik, Garvit Juniwal, Milo MK Martin, Mukund Raghothaman, Sanjit A Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. Syntax-guided synthesis. Dependable Software Systems Engineering, 40:1-25, 2015. Google Scholar
  4. Rajeev Alur, Dana Fisman, Rishabh Singh, and Armando Solar-Lezama. Sygus-comp 2017: Results and analysis. Google Scholar
  5. Rajeev Alur, Dana Fisman, Rishabh Singh, and Armando Solar-Lezama. Sygus-comp 2016: Results and analysis. arXiv preprint, 2016. URL: http://arxiv.org/abs/1611.07627.
  6. Rajeev Alur, Arjun Radhakrishna, and Abhishek Udupa. Scaling enumerative program synthesis via divide and conquer. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 319-336. Springer, 2017. Google Scholar
  7. Shengwei An, Rishabh Singh, Sasa Misailovic, and Roopsha Samanta. Augmented example-based synthesis using relational perturbation properties. Proceedings of the ACM on Programming Languages, 4(POPL):56, 2019. Google Scholar
  8. Pavol Bielik, Veselin Raychev, and Martin Vechev. Phog: Probabilistic model for code. In Maria Florina Balcan and Kilian Q. Weinberger, editors, Proceedings of The 33rd International Conference on Machine Learning, volume 48 of Proceedings of Machine Learning Research, pages 2933-2942, New York, New York, USA, 20-22 june 2016. PMLR. URL: http://proceedings.mlr.press/v48/bielik16.html.
  9. Nikolaj Bjørner, Anh-Dung Phan, and Lars Fleckenstein. νz-an optimizing smt solver. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 194-199. Springer, 2015. Google Scholar
  10. Brian Borchers and Judith Furman. A two-phase exact algorithm for max-sat and weighted max-sat problems. Journal of Combinatorial Optimization, 2(4):299-306, 1998. Google Scholar
  11. Joel Brandt, Philip J Guo, Joel Lewenstein, Mira Dontcheva, and Scott R Klemmer. Two studies of opportunistic programming: interleaving web foraging, learning, and writing code. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 1589-1598. ACM, 2009. Google Scholar
  12. Joel Brandt, Philip J Guo, Joel Lewenstein, and Scott R Klemmer. Opportunistic programming: How rapid ideation and prototyping occur in practice. In Proceedings of the 4th international workshop on End-user software engineering, pages 1-5. ACM, 2008. Google Scholar
  13. Sarah Chasins. Democratizing Web Automation: Programming for Social Scientists and Other Domain Experts. PhD thesis, EECS Department, University of California, Berkeley, October 2019. URL: http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-139.html.
  14. Yanju Chen, Ruben Martins, and Yu Feng. Maximal multi-layer specification synthesis. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 602-612, 2019. Google Scholar
  15. Loris D'Antoni, Roopsha Samanta, and Rishabh Singh. Qlose: Program repair with quantitative objectives. In Computer Aided Verification - 28th International Conference, CAV 2016, Toronto, ON, Canada, July 17-23, 2016, Proceedings, Part II, pages 383-401, 2016. URL: https://doi.org/10.1007/978-3-319-41540-6_21.
  16. Jacob Devlin, Jonathan Uesato, Surya Bhupatiraju, Rishabh Singh, Abdel-rahman Mohamed, and Pushmeet Kohli. Robustfill: Neural program learning under noisy I/O. In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017, pages 990-998, 2017. Google Scholar
  17. Kevin Ellis, Daniel Ritchie, Armando Solar-Lezama, and Josh Tenenbaum. Learning to infer graphics programs from hand-drawn images. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 6059-6068. Curran Associates, Inc., 2018. URL: http://papers.nips.cc/paper/7845-learning-to-infer-graphics-programs-from-hand-drawn-images.pdf.
  18. Azadeh Farzan and Victor Nicolet. Modular divide-and-conquer parallelization of nested loops. In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019, pages 610-624, New York, NY, USA, 2019. ACM. URL: https://doi.org/10.1145/3314221.3314612.
  19. Yu Feng, Ruben Martins, Jacob Van Geffen, Isil Dillig, and Swarat Chaudhuri. Component-based synthesis of table consolidation and transformation tasks from examples. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017, pages 422-436, 2017. Google Scholar
  20. Yu Feng, Ruben Martins, Yuepeng Wang, Isil Dillig, and Thomas W Reps. Component-based synthesis for complex apis. ACM SIGPLAN Notices, 52(1):599-612, 2017. Google Scholar
  21. John K Feser, Swarat Chaudhuri, and Isil Dillig. Synthesizing data structure transformations from input-output examples. In ACM SIGPLAN Notices, volume 50(6), pages 229-239. ACM, 2015. Google Scholar
  22. Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic. Example-directed synthesis: A type-theoretic interpretation. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '16, pages 802-815, New York, NY, USA, 2016. ACM. URL: https://doi.org/10.1145/2837614.2837629.
  23. Sumit Gulwani. Automating string processing in spreadsheets using input-output examples. In Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '11, pages 317-330, New York, NY, USA, 2011. ACM. URL: https://doi.org/10.1145/1926385.1926423.
  24. Sumit Gulwani. Synthesis from examples: Interaction models and algorithms. In Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2012 14th International Symposium on, pages 8-14. IEEE, 2012. Google Scholar
  25. Sumit Gulwani. Programming by examples (and its applications in data wrangling). In Javier Esparza, Orna Grumberg, and Salomon Sickert, editors, Verification and Synthesis of Correct and Secure Systems. IOS Press, 2016. Google Scholar
  26. Sumit Gulwani, Susmit Jha, Ashish Tiwari, and Ramarathnam Venkatesan. Synthesis of loop-free programs. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2011, San Jose, CA, USA, June 4-8, 2011, pages 62-73, 2011. URL: https://doi.org/10.1145/1993498.1993506.
  27. Zheng Guo, Michael James, David Justo, Jiaxiao Zhou, Ziteng Wang, Ranjit Jhala, and Nadia Polikarpova. Program synthesis by type-guided abstraction refinement. In Principles of programming languages, page to appear, 2020. Google Scholar
  28. Tihomir Gvero, Viktor Kuncak, Ivan Kuraj, and Ruzica Piskac. Complete completion using types and weights. In ACM SIGPLAN Notices, volume 48(6), pages 27-38. ACM, 2013. Google Scholar
  29. Jeevana Priya Inala and Rishabh Singh. Webrelate: integrating web data with spreadsheets using examples. PACMPL, 2(POPL):2:1-2:28, 2018. URL: https://doi.org/10.1145/3158090.
  30. Shachar Itzhaky, Rohit Singh, Armando Solar-Lezama, Kuat Yessenov, Yongquan Lu, Charles Leiserson, and Rezaul Chowdhury. Deriving divide-and-conquer dynamic programming algorithms using solver-aided transformations. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, pages 145-164. ACM, 2016. Google Scholar
  31. Vu Le and Sumit Gulwani. FlashExtract: a framework for data extraction by examples. In Michael F. P. O'Boyle and Keshav Pingali, editors, Proceedings of the 35th Conference on Programming Language Design and Implementation, page 55. ACM, 2014. URL: https://doi.org/10.1145/2594291.2594333.
  32. Vu Le, Daniel Perelman, Oleksandr Polozov, Mohammad Raza, Abhishek Udupa, and Sumit Gulwani. Interactive program synthesis. CoRR, abs/1703.03539, 2017. URL: http://arxiv.org/abs/1703.03539.
  33. Woosuk Lee, Kihong Heo, Rajeev Alur, and Mayur Naik. Accelerating search-based program synthesis using learned probabilistic models. In ACM SIGPLAN Notices, volume 53(4), pages 436-449. ACM, 2018. Google Scholar
  34. Vladimir I Levenshtein. Binary codes capable of correcting deletions, insertions, and reversals. In Soviet physics doklady, volume 10(8), pages 707-710, 1966. Google Scholar
  35. Anders Miltner, Solomon Maina, Kathleen Fisher, Benjamin C. Pierce, David Walker, and Steve Zdancewic. Synthesizing symmetric lenses. Proc. ACM Program. Lang., 3(ICFP), July 2019. URL: https://doi.org/10.1145/3341699.
  36. Peter-Michael Osera and Steve Zdancewic. Type-and-example-directed program synthesis. In ACM SIGPLAN Notices, volume 50(6), pages 619-630. ACM, 2015. Google Scholar
  37. Antti Oulasvirta and Pertti Saariluoma. Surviving task interruptions: Investigating the implications of long-term working memory theory. International Journal of Human-Computer Studies, 64(10):941-961, 2006. Google Scholar
  38. Saswat Padhi, Todd D. Millstein, Aditya V. Nori, and Rahul Sharma. Overfitting in synthesis: Theory and practice. In Computer Aided Verification - 31st International Conference, CAV 2019, New York City, NY, USA, July 15-18, 2019, Proceedings, Part I, pages 315-334, 2019. URL: https://doi.org/10.1007/978-3-030-25540-4_17.
  39. Hila Peleg, Shachar Itzhaky, and Sharon Shoham. Abstraction-based interaction model for synthesis. In Isil Dillig and Jens Palsberg, editors, Verification, Model Checking, and Abstract Interpretation, pages 382-405, Cham, 2018. Springer International Publishing. Google Scholar
  40. Hila Peleg, Sharon Shoham, and Eran Yahav. Programming not only by example. In Proceedings of the 40th International Conference on Software Engineering, pages 1114-1124. ACM, 2018. Google Scholar
  41. Phitchaya Mangpo Phothilimthana, Aditya Thakur, Rastislav Bodik, and Dinakar Dhurjati. Scaling up superoptimization. In ACM SIGARCH Computer Architecture News, volume 44(2), pages 297-310. ACM, 2016. Google Scholar
  42. Oleksandr Polozov and Sumit Gulwani. Flashmeta: A framework for inductive program synthesis. ACM SIGPLAN Notices, 50(10):107-126, 2015. Google Scholar
  43. Veselin Raychev, Martin Vechev, and Eran Yahav. Code completion with statistical language models. In ACM SIGPLAN Notices, volume 49(6), pages 419-428. ACM, 2014. Google Scholar
  44. Andrew Reynolds, Haniel Barbosa, Andres Nötzli, Clark W. Barrett, and Cesare Tinelli. cvc4sy: Smart and fast term enumeration for syntax-guided synthesis. In Computer Aided Verification - 31st International Conference, CAV 2019, New York City, NY, USA, July 15-18, 2019, Proceedings, Part II, pages 74-83, 2019. Google Scholar
  45. Andrew Reynolds, Morgan Deters, Viktor Kuncak, Cesare Tinelli, and Clark Barrett. Counterexample-guided quantifier instantiation for synthesis in smt. In International Conference on Computer Aided Verification, pages 198-216. Springer, 2015. Google Scholar
  46. Kensen Shi, Jacob Steinhardt, and Percy Liang. Frangel: Component-based synthesis with control structures. Proc. ACM Program. Lang., 3(POPL), January 2019. URL: https://doi.org/10.1145/3290386.
  47. Rishabh Singh and Sumit Gulwani. Predicting a correct program in programming by example. In Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part I, pages 398-414, 2015. URL: https://doi.org/10.1007/978-3-319-21690-4_23.
  48. Rohit Singh, Venkata Vamsikrishna Meduri, Ahmed K. Elmagarmid, Samuel Madden, Paolo Papotti, Jorge-Arnulfo Quiané-Ruiz, Armando Solar-Lezama, and Nan Tang. Synthesizing entity matching rules by examples. PVLDB, 11(2):189-202, 2017. URL: http://www.vldb.org/pvldb/vol11/p189-singh.pdf.
  49. Calvin Smith and Aws Albarghouthi. Program synthesis with equivalence reduction. In Verification, Model Checking, and Abstract Interpretation - 20th International Conference, VMCAI 2019, Cascais, Portugal, January 13-15, 2019, Proceedings, pages 24-47, 2019. URL: https://doi.org/10.1007/978-3-030-11245-5_2.
  50. Armando Solar-Lezama. Program sketching. STTT, 15(5-6):475-495, 2013. URL: https://doi.org/10.1007/s10009-012-0249-7.
  51. Armando Solar-Lezama, Christopher Grant Jones, and Rastislav Bodik. Sketching concurrent data structures. In ACM SIGPLAN Notices, volume 43(6), pages 136-148. ACM, 2008. Google Scholar
  52. Armando Solar-Lezama, Liviu Tancau, Rastislav Bodik, Sanjit Seshia, and Vijay Saraswat. Combinatorial sketching for finite programs. ACM SIGOPS Operating Systems Review, 40(5):404-415, 2006. Google Scholar
  53. Yonglong Tian, Andrew Luo, Xingyuan Sun, Kevin Ellis, William T. Freeman, Joshua B. Tenenbaum, and Jiajun Wu. Learning to infer and execute 3d shape programs. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019, 2019. Google Scholar
  54. Emina Torlak and Rastislav Bodík. A lightweight symbolic virtual machine for solver-aided host languages. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '14, Edinburgh, United Kingdom - June 09 - 11, 2014, page 54, 2014. URL: https://doi.org/10.1145/2594291.2594340.
  55. Abhishek Udupa, Arun Raghavan, Jyotirmoy V Deshmukh, Sela Mador-Haim, Milo MK Martin, and Rajeev Alur. Transit: specifying protocols with concolic snippets. ACM SIGPLAN Notices, 48(6):287-296, 2013. Google Scholar
  56. Chenglong Wang, Alvin Cheung, and Rastislav Bodik. Synthesizing highly expressive sql queries from input-output examples. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 452-466. ACM, 2017. Google Scholar
  57. Ke Wang, Rishabh Singh, and Zhendong Su. Search, align, and repair: data-driven feedback generation for introductory programming exercises. In ACM SIGPLAN Notices, volume 53(4), pages 481-495. ACM, 2018. Google Scholar
  58. Navid Yaghmazadeh, Xinyu Wang, and Isil Dillig. Automated migration of hierarchical data to relational tables using programming-by-example. Proc. VLDB Endow., 11(5):580-593, January 2018. URL: https://doi.org/10.1145/3187009.3177735.
  59. Tianyi Zhang, Di Yang, Crista Lopes, and Miryung Kirnt. Analyzing and supporting adaptation of online code examples. In Proceedings of the 41st International Conference on Software Engineering, pages 316-327. IEEE Press, 2019. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail