Perfect Is the Enemy of Good: Best-Effort Program Synthesis

Peleg, Hila; Polikarpova, Nadia

doi:10.4230/LIPIcs.ECOOP.2020.2

Abstract

Program synthesis promises to help software developers with everyday tasks by generating code snippets automatically from input-output examples and other high-level specifications. The conventional wisdom is that a synthesizer must always satisfy the specification exactly. We conjecture that this all-or-nothing paradigm stands in the way of adopting program synthesis as a developer tool: in practice, the user-written specification often contains errors or is simply too hard for the synthesizer to solve within a reasonable time; in these cases, the user is left with a single over-fitted result or, more often than not, no result at all. In this paper we propose a new program synthesis paradigm we call best-effort program synthesis, where the synthesizer returns a ranked list of partially-valid results, i.e., programs that satisfy some part of the specification. To support this paradigm, we develop best-effort enumeration, a new synthesis algorithm that extends a popular program enumeration technique with the ability to accumulate and return multiple partially-valid results with minimal overhead. We implement this algorithm in a tool called BESTER, and evaluate it on 79 synthesis benchmarks from the literature. Contrary to the conventional wisdom, our evaluation shows that BESTER returns useful results even when the specification is flawed or too hard: (i) for all benchmarks with an error in the specification, the top three BESTER results contain the correct solution, and (ii) for most hard benchmarks, the top three results contain non-trivial fragments of the correct solution. We also performed an exploratory user study, which confirms our intuition that partially-valid results are useful: the study shows that programmers use the output of the synthesizer for comprehension and often incorporate it into their solutions.
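To make the idea of a ranked list of partially-valid results concrete, the following is a minimal sketch and not the BESTER implementation. It assumes a toy bottom-up enumerator over integer expressions in one variable, prunes candidates that behave identically on the example inputs, and ranks what remains by the number of examples satisfied. The grammar, the function name best_effort_enumerate, the parameters rounds and top_k, and the ranking criterion are all assumptions made for illustration.

```python
from itertools import product


def best_effort_enumerate(examples, rounds=2, top_k=3):
    """Toy best-effort search (illustrative sketch only).

    examples: list of (input, expected_output) integer pairs.
    Returns up to top_k (expression, satisfied_count) pairs, best first.
    """
    inputs = tuple(x for x, _ in examples)
    expected = tuple(y for _, y in examples)

    # Maps output vectors to the first expression that produced them, so
    # behaviourally identical candidates are kept only once.
    seen = {}

    def add(expr, outs):
        if outs not in seen:
            seen[outs] = expr

    # Terminals of the assumed toy grammar: the variable x and small constants.
    add("x", inputs)
    for c in (0, 1, 2):
        add(str(c), tuple(c for _ in inputs))

    # Each round combines everything found so far with + and *.
    for _ in range(rounds):
        current = list(seen.items())
        for (l_outs, l_expr), (r_outs, r_expr) in product(current, repeat=2):
            add(f"({l_expr} + {r_expr})",
                tuple(a + b for a, b in zip(l_outs, r_outs)))
            add(f"({l_expr} * {r_expr})",
                tuple(a * b for a, b in zip(l_outs, r_outs)))

    def satisfied(outs):
        # Number of examples on which the candidate agrees with the spec.
        return sum(o == e for o, e in zip(outs, expected))

    # Keep every candidate that satisfies at least one example, best first.
    # sorted() is stable, so ties keep enumeration order (smaller first).
    ranked = sorted(
        ((expr, satisfied(outs)) for outs, expr in seen.items()),
        key=lambda pair: -pair[1],
    )
    return [(expr, n) for expr, n in ranked if n > 0][:top_k]


if __name__ == "__main__":
    # The intended function is 2*x + 1; the last example is deliberately wrong.
    spec = [(1, 3), (2, 5), (3, 7), (4, 0)]
    for expr, n in best_effort_enumerate(spec):
        print(f"{expr} satisfies {n} of {len(spec)} examples")
```

In this sketch no candidate satisfies all four examples, so an all-or-nothing search over the same space would return nothing, while the ranked list still surfaces a program that agrees with the three consistent examples.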

