Functional Programming with Datalog

Authors André Pacak, Sebastian Erdweg



Author Details

André Pacak
  • JGU Mainz, Germany
Sebastian Erdweg
  • JGU Mainz, Germany

Cite As

André Pacak and Sebastian Erdweg. Functional Programming with Datalog. In 36th European Conference on Object-Oriented Programming (ECOOP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 222, pp. 7:1-7:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.ECOOP.2022.7

Abstract

Datalog is a carefully restricted logic programming language. What makes Datalog attractive is its declarative fixpoint semantics: Datalog queries consist of simple Horn clauses, yet Datalog solvers efficiently compute all derivable tuples even for recursive queries. However, as we argue in this paper, Datalog is ill-suited as a programming language and Datalog programs are hard to write and maintain. We propose a "new" frontend for Datalog called functional IncA: functional programming with sets. While programmers write recursive functions over algebraic data types and sets, we transparently translate all code to Datalog relations. However, we retain Datalog’s strengths: functions that generate sets can encode arbitrary relations, and mutually recursive functions have fixpoint semantics. We also ensure that the generated Datalog program terminates whenever the original functional program terminates, so that we can apply off-the-shelf bottom-up Datalog solvers. We demonstrate the versatility and ease of use of functional IncA by implementing a type checker, a program transformation, an interpreter of the untyped lambda calculus, two data-flow analyses, and clone detection of Java bytecode.
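
To make the termination concern concrete, the following Python sketch evaluates a recursive function bottom-up after encoding it as relations. It is a toy illustration of the general idea of demand-driven evaluation, not functional IncA's actual translation or syntax. The function name `solve_fact`, the relations `demand_fact` and `fact`, and the rules in the comments are all hypothetical. Without the demand relation, a bottom-up solver would keep deriving `fact(n, r)` tuples for ever larger `n`; restricting derivation to demanded inputs lets the fixpoint be reached whenever the original recursion terminates.

```python
# Toy bottom-up evaluation of a recursive function encoded as relations.
# Hypothetical encoding for illustration only; assumes a non-negative query.

def solve_fact(query: int) -> int:
    """Evaluate factorial(query) by naive fixpoint iteration over two relations."""
    demand = {query}   # demand_fact(n): inputs whose result is actually needed
    fact = set()       # fact(n, r): r is the result of factorial(n)
    changed = True
    while changed:
        changed = False
        new_demand = set()
        new_fact = set()
        # demand_fact(n - 1) :- demand_fact(n), n > 0.
        for n in demand:
            if n > 0:
                new_demand.add(n - 1)
        # fact(0, 1) :- demand_fact(0).
        if 0 in demand:
            new_fact.add((0, 1))
        # fact(n, n * r) :- demand_fact(n), n > 0, fact(n - 1, r).
        for (m, r) in fact:
            if m + 1 in demand:
                new_fact.add((m + 1, (m + 1) * r))
        # Stop once an iteration derives no new tuples (the fixpoint).
        if not new_demand <= demand or not new_fact <= fact:
            demand |= new_demand
            fact |= new_fact
            changed = True
    results = {r for (n, r) in fact if n == query}
    assert len(results) == 1, "no result derived (negative query?)"
    return results.pop()
```

Calling `solve_fact(5)` seeds the demand relation with `5`, propagates demand down to `0`, and then derives results back up to the queried input. Real solvers use semi-naive evaluation rather than this naive loop.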

Subject Classification

ACM Subject Classification
  • Software and its engineering → Software notations and tools
Keywords
  • Datalog
  • functional programming
  • demand transformation
