Breaking the Negative Cycle: Exploring the Design Space of Stratification for First-Class Datalog Constraints

Authors Jonathan Lindegaard Starup , Magnus Madsen , Ondřej Lhoták



PDF
Thumbnail PDF

File

LIPIcs.ECOOP.2023.31.pdf
  • Filesize: 0.99 MB
  • 28 pages

Document Identifiers

Author Details

Jonathan Lindegaard Starup
  • Department of Computer Science, Aarhus University, Denmark
Magnus Madsen
  • Department of Computer Science, Aarhus University, Denmark
Ondřej Lhoták
  • David R. Cheriton School of Computer Science, University of Waterloo, Canada

Cite As Get BibTex

Jonathan Lindegaard Starup, Magnus Madsen, and Ondřej Lhoták. Breaking the Negative Cycle: Exploring the Design Space of Stratification for First-Class Datalog Constraints. In 37th European Conference on Object-Oriented Programming (ECOOP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 263, pp. 31:1-31:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023) https://doi.org/10.4230/LIPIcs.ECOOP.2023.31

Abstract

The λ_Dat calculus brings together the power of functional and declarative logic programming in one language. In λ_Dat, Datalog constraints are first-class values that can be constructed, passed around as arguments, returned, composed with other constraints, and solved.
A significant part of the expressive power of Datalog comes from the use of negation. Stratified negation is a particularly simple and practical form of negation accessible to ordinary programmers. Stratification requires that Datalog programs must not use recursion through negation.
For a Datalog program, this requirement is straightforward to check, but for a λ_Dat program, it is not so simple: A λ_Dat program constructs, composes, and solves Datalog programs at runtime. Hence stratification cannot readily be determined at compile-time.
In this paper, we explore the design space of stratification for λ_Dat. We investigate strategies to ensure, at compile-time, that programs constructed at runtime are guaranteed to be stratified, and we argue that previous design choices in the Flix programming language have been suboptimal.

Subject Classification

ACM Subject Classification
  • Theory of computation → Constraint and logic programming
Keywords
  • Datalog
  • first-class Datalog constraints
  • negation
  • stratified negation
  • type system
  • row polymorphism
  • the Flix programming language

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Peter Alvaro, William R. Marczak, Neil Conway, Joseph M. Hellerstein, David Maier, and Russell Sears. Dedalus: Datalog in time and space. In Oege de Moor, Georg Gottlob, Tim Furche, and Andrew Sellers, editors, Datalog Reloaded, pages 262-281, Berlin, 2011. Springer Berlin Heidelberg. Google Scholar
  2. Mario Alviano, Wolfgang Faber, Nicola Leone, Simona Perri, Gerald Pfeifer, and Giorgio Terracina. The disjunctive Datalog system DLV. In Oege de Moor, Georg Gottlob, Tim Furche, and Andrew Sellers, editors, Datalog Reloaded, pages 282-301, Berlin, 2011. Springer Berlin Heidelberg. Google Scholar
  3. Molham Aref, Balder ten Cate, Todd J. Green, Benny Kimelfeld, Dan Olteanu, Emir Pasalic, Todd L. Veldhuizen, and Geoffrey Washburn. Design and implementation of the logicblox system. In Proc. of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pages 1371-1382, New York, NY, USA, 2015. Association for Computing Machinery. URL: https://doi.org/10.1145/2723372.2742796.
  4. Pavel Avgustinov, Oege de Moor, Michael Peyton Jones, and Max Schäfer. QL: Object-oriented queries on relational data. In Shriram Krishnamurthi and Benjamin S. Lerner, editors, 30th European Conference on Object-Oriented Programming (ECOOP 2016), volume 56 of Leibniz International Proc. in Informatics (LIPIcs), pages 2:1-2:25, Dagstuhl, Germany, 2016. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. URL: https://doi.org/10.4230/LIPIcs.ECOOP.2016.2.
  5. Vince Bárány, Balder ten Cate, and Martin Otto. Queries with guarded negation. Proc. VLDB Endow., 5(11):1328-1339, July 2012. URL: https://doi.org/10.14778/2350229.2350250.
  6. Aaron Bembenek, Michael Greenberg, and Stephen Chong. Formulog: Datalog for SMT-based static analysis. Proc. ACM Program. Lang., 4(OOPSLA), November 2020. URL: https://doi.org/10.1145/3428209.
  7. Martin Bravenboer and Yannis Smaragdakis. Strictly declarative specification of sophisticated points-to analyses. In Proc. of the 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA '09, pages 243-262, New York, NY, USA, 2009. Association for Computing Machinery. URL: https://doi.org/10.1145/1640089.1640108.
  8. Andrea Calì, Georg Gottlob, and Thomas Lukasiewicz. A general Datalog-based framework for tractable query answering over ontologies. In Proc. of the Twenty-Eighth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS '09, pages 77-86, New York, NY, USA, 2009. Association for Computing Machinery. URL: https://doi.org/10.1145/1559795.1559809.
  9. Stefano Ceri, Georg Gottlob, and Letizia Tanca. Logic Programming and Databases: An Overview, pages 1-15. Springer Berlin Heidelberg, Berlin, 1990. URL: https://doi.org/10.1007/978-3-642-83952-8_1.
  10. Neil Conway, William R. Marczak, Peter Alvaro, Joseph M. Hellerstein, and David Maier. Logic and lattices for distributed programming. In Proc. of the Third ACM Symposium on Cloud Computing, SoCC '12, New York, NY, USA, 2012. Association for Computing Machinery. URL: https://doi.org/10.1145/2391229.2391230.
  11. Luis Damas. Type assignment in programming languages. PhD thesis, The University of Edinburgh, 1984. Google Scholar
  12. Luis Damas and Robin Milner. Principal type-schemes for functional programs. In Proc. of the 9th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '82, pages 207-212, New York, NY, USA, 1982. Association for Computing Machinery. URL: https://doi.org/10.1145/582153.582176.
  13. Thomas Eiter, Georg Gottlob, and Heikki Mannila. Disjunctive Datalog. ACM Trans. Database Syst., 22(3):364-418, September 1997. URL: https://doi.org/10.1145/261124.261126.
  14. Melvin Fitting. Fixpoint semantics for logic programming a survey. Theoretical Computer Science, 278(1):25-51, 2002. Mathematical Foundations of Programming Semantics 1996. URL: https://doi.org/10.1016/S0304-3975(00)00330-3.
  15. Antonio Flores-Montoya and Eric Schulte. Datalog disassembly. In 29th USENIX Security Symposium (USENIX Security 20), pages 1075-1092, Berkeley, California, USA, August 2020. USENIX Association. Google Scholar
  16. Michael Gelfond and Vladimir Lifschitz. The stable model semantics for logic programming. In Robert Kowalski, Bowen, and Kenneth, editors, Proc. of International Logic Programming Conference and Symposium, pages 1070-1080, Cambridge, MA, USA, 1988. MIT Press. Google Scholar
  17. Michael Gelfond and Vladimir Lifschitz. Classical negation in logic programs and disjunctive databases. New Generation Computing, 9(3):365-385, August 1991. URL: https://doi.org/10.1007/BF03037169.
  18. G. Gottlob, S. Ceri, and L. Tanca. What you always wanted to know about Datalog (and never dared to ask). IEEE Transactions on Knowledge and Data Engineering, 1(01):146-166, 1989. URL: https://doi.org/10.1109/69.43410.
  19. Georg Gottlob and Andreas Pieris. Beyond SPARQL under OWL 2 QL entailment regime: Rules to the rescue. In Proc. of the 24th International Conference on Artificial Intelligence, IJCAI'15, pages 2999-3007, Palo Alto, California, USA, 2015. AAAI Press. Google Scholar
  20. Daniel Halperin, Victor Teixeira de Almeida, Lee Lee Choo, Shumo Chu, Paraschos Koutris, Dominik Moritz, Jennifer Ortiz, Vaspol Ruamviboonsuk, Jingjing Wang, Andrew Whitaker, Shengliang Xu, Magdalena Balazinska, Bill Howe, and Dan Suciu. Demonstration of the Myria big data management service. In Proc. of the 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD '14, pages 881-884, New York, NY, USA, 2014. Association for Computing Machinery. URL: https://doi.org/10.1145/2588555.2594530.
  21. R. Hindley. The principal type-scheme of an object in combinatory logic. Transactions of the American Mathematical Society, 146:29-60, 1969. URL: https://doi.org/10.2307/1995158.
  22. Herbert Jordan, Bernhard Scholz, and Pavle Subotić. Soufflé: On synthesis of program analyzers. In Swarat Chaudhuri and Azadeh Farzan, editors, Computer Aided Verification, pages 422-430, Cham, 2016. Springer International Publishing. Google Scholar
  23. Mark Kaminski, Bernardo Cuenca Grau, Egor V. Kostylev, Boris Motik, and Ian Horrocks. Foundations of declarative data analysis using limit Datalog programs. In Proc. of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17, pages 1123-1130, California, USA, 2017. International Joint Conferences on Artificial Intelligence. URL: https://doi.org/10.24963/ijcai.2017/156.
  24. Bas Ketsman, Aws Albarghouthi, and Paraschos Koutris. Distribution policies for Datalog. In Benny Kimelfeld and Yael Amsterdamer, editors, 21st International Conference on Database Theory (ICDT 2018), volume 98 of Leibniz International Proc. in Informatics (LIPIcs), pages 17:1-17:22, Dagstuhl, Germany, 2018. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. URL: https://doi.org/10.4230/LIPIcs.ICDT.2018.17.
  25. Ross D. King. Applying inductive logic programming to predicting gene function. AI Mag., 25(1):57-68, March 2004. Google Scholar
  26. Phokion G. Kolaitis and Christos H. Papadimitriou. Why not negation by fixpoint? In Proc. of the Seventh ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS '88, pages 231-239, New York, NY, USA, 1988. Association for Computing Machinery. URL: https://doi.org/10.1145/308386.308446.
  27. Kenneth Kunen. Negation in logic programming. The Journal of Logic Programming, 4(4):289-308, 1987. URL: https://doi.org/10.1016/0743-1066(87)90007-0.
  28. Daan Leijen. Extensible records with scoped labels. Trends in Functional Programming, 5:297-312, 2005. Google Scholar
  29. Boon Thau Loo, Tyson Condie, Minos Garofalakis, David E. Gay, Joseph M. Hellerstein, Petros Maniatis, Raghu Ramakrishnan, Timothy Roscoe, and Ion Stoica. Declarative networking. Commun. ACM, 52(11):87-95, November 2009. URL: https://doi.org/10.1145/1592761.1592785.
  30. Magnus Madsen and Ondřej Lhoták. Fixpoints for the masses: Programming with first-class Datalog constraints. Proc. ACM Program. Lang., 4(OOPSLA), November 2020. URL: https://doi.org/10.1145/3428193.
  31. Magnus Madsen, Ming-Ho Yee, and Ondřej Lhoták. From Datalog to Flix: A declarative language for fixed points on lattices. In Proc. of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), PLDI '16, pages 194-208, New York, NY, USA, 2016. Association for Computing Machinery. URL: https://doi.org/10.1145/2908080.2908096.
  32. Hongyuan Mei, Guanghui Qin, Minjie Xu, and Jason Eisner. Neural Datalog through time: Informed temporal modeling via logical specification. In International Conference on Machine Learning, pages 6808-6819, Madison, WI, USA, 2020. PMLR, Omnipress. Google Scholar
  33. Robin Milner. A theory of type polymorphism in programming. Journal of Computer and System Sciences, 17(3):348-375, December 1978. URL: https://doi.org/10.1016/0022-0000(78)90014-4.
  34. Raymond J. Mooney. Inductive logic programming for natural language processing. In Stephen Muggleton, editor, Inductive Logic Programming, pages 1-22, Berlin, 1997. Springer Berlin Heidelberg. Google Scholar
  35. Inderpal Singh Mumick, Hamid Pirahesh, and Raghu Ramakrishnan. The magic of duplicates and aggregates. In Proc. of the Sixteenth International Conference on Very Large Databases, pages 264-277, San Francisco, CA, USA, 1990. Morgan Kaufmann Publishers Inc. Google Scholar
  36. Teodor C. Przymusinski. Chapter 5 - on the declarative semantics of deductive databases and logic programs. In Jack Minker, editor, Foundations of Deductive Databases and Logic Programming, pages 193-216. Morgan Kaufmann, USA, 1988. URL: https://doi.org/10.1016/B978-0-934613-40-8.50009-9.
  37. Kenneth A. Ross. Modular stratification and magic sets for DATALOG programs with negation. In Proc. of the Ninth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS '90, pages 161-171, New York, NY, USA, 1990. Association for Computing Machinery. URL: https://doi.org/10.1145/298514.298558.
  38. Jiwon Seo. Datalog extensions for bioinformatic data analysis. Annu Int Conf IEEE Eng Med Biol Soc, 2018:1303-1306, July 2018. Google Scholar
  39. Jiwon Seo, Stephen Guo, and Monica S. Lam. SociaLite: Datalog extensions for efficient social network analysis. In 2013 IEEE 29th International Conference on Data Engineering (ICDE), pages 278-289, Manhattan, New York, USA, 2013. IEEE. URL: https://doi.org/10.1109/ICDE.2013.6544832.
  40. Jiwon Seo, Jongsoo Park, Jaeho Shin, and Monica S. Lam. Distributed Socialite: A Datalog-based language for large-scale graph analysis. Proc. VLDB Endow., 6(14):1906-1917, September 2013. URL: https://doi.org/10.14778/2556549.2556572.
  41. Alexander Shkapsky, Mohan Yang, Matteo Interlandi, Hsuan Chiu, Tyson Condie, and Carlo Zaniolo. Big data analytics with Datalog queries on Spark. In Proc. of the 2016 International Conference on Management of Data, SIGMOD '16, pages 1135-1149, New York, NY, USA, 2016. Association for Computing Machinery. URL: https://doi.org/10.1145/2882903.2915229.
  42. Yannis Smaragdakis and Martin Bravenboer. Using Datalog for fast and easy program analysis. In Oege de Moor, Georg Gottlob, Tim Furche, and Andrew Sellers, editors, Datalog Reloaded, pages 245-251, Berlin, 2011. Springer Berlin Heidelberg. Google Scholar
  43. Tamás Szabó, Gábor Bergmann, Sebastian Erdweg, and Markus Voelter. Incrementalizing lattice-based program analyses in Datalog. Proc. ACM Program. Lang., 2(OOPSLA), October 2018. URL: https://doi.org/10.1145/3276509.
  44. Petar Tsankov. Security analysis of smart contracts in Datalog. In Tiziana Margaria and Bernhard Steffen, editors, Leveraging Applications of Formal Methods, Verification and Validation. Industrial Practice, pages 316-322, Cham, 2018. Springer International Publishing. Google Scholar
  45. Jeffrey D. Ullman. Principles of database and knowledge-base systems, 1988. Google Scholar
  46. Mario Wenzel and Stefan Brass. Declarative programming for microcontrollers - Datalog on Arduino. In Declarative Programming and Knowledge Management: Conference on Declarative Programming, DECLARE 2019, Unifying INAP, WLP, and WFLP, Cottbus, Germany, September 9–12, 2019, Revised Selected Papers, pages 119-138, Berlin, 2019. Springer-Verlag. URL: https://doi.org/10.1007/978-3-030-46714-2_9.
  47. Andrew K Wright and Matthias Felleisen. A syntactic approach to type soundness. Information and computation, 115(1):38-94, 1994. Google Scholar
  48. Carlo Zaniolo, Natraj Arni, and Kayliang Ong. Negation and aggregates in recursive rules: the LDL++ approach. In Stefano Ceri, Katsumi Tanaka, and Shalom Tsur, editors, Deductive and Object-Oriented Databases, pages 204-221, Berlin, 1993. Springer Berlin Heidelberg. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail