On the Convergence Rate of Linear Datalog^∘ over Stable Semirings

Authors Sungjin Im, Benjamin Moseley, Hung Ngo, Kirk Pruhs



Author Details

Sungjin Im
  • University of California, Merced, CA, USA
Benjamin Moseley
  • Carnegie Mellon University, Pittsburgh, PA, USA
Hung Ngo
  • RelationalAI, Berkeley, CA, USA
Kirk Pruhs
  • University of Pittsburgh, Pittsburgh, PA, USA

Cite As

Sungjin Im, Benjamin Moseley, Hung Ngo, and Kirk Pruhs. On the Convergence Rate of Linear Datalog^∘ over Stable Semirings. In 27th International Conference on Database Theory (ICDT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 290, pp. 11:1-11:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024).
https://doi.org/10.4230/LIPIcs.ICDT.2024.11

Abstract

Datalog^∘ is an extension of Datalog in which, instead of a program being a collection of unions of conjunctive queries over the standard Boolean semiring, a program may be a collection of sum-product queries over an arbitrary commutative partially ordered pre-semiring. Datalog^∘ is more powerful than Datalog in that its additional algebraic structure allows for supporting recursion with aggregation. At the same time, Datalog^∘ retains the syntactic and semantic simplicity of Datalog: Datalog^∘ has declarative least fixpoint semantics. The least fixpoint can be found via the naïve evaluation algorithm, which repeatedly applies the immediate consequence operator until no further change is possible. It was shown in [Mahmoud Abo Khamis et al., 2022] that, when the underlying semiring is p-stable, the naïve evaluation of any Datalog^∘ program over the semiring converges in a finite number of steps. However, the upper bounds on the rate of convergence were exponential in the number n of ground IDB atoms. This paper establishes polynomial upper bounds on the convergence rate of the naïve algorithm on linear Datalog^∘ programs, which are quite common in practice. In particular, the main result of this paper is that the convergence rate of linear Datalog^∘ programs under any p-stable semiring is O(pn³). Furthermore, we show a matching lower bound by constructing a p-stable semiring and a linear Datalog^∘ program that requires Ω(pn³) iterations for the naïve evaluation algorithm to converge. Next, we study the convergence rate of linear Datalog^∘ programs in terms of the number of elements in the semiring. When L is the number of elements, the convergence rate is bounded by O(pn log L). This significantly improves the convergence rate for small L. We show a nearly matching lower bound as well.
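
The naïve evaluation described in the abstract can be sketched for the linear case as follows. This is a minimal illustration under stated assumptions, not the paper's construction: it assumes the program has already been grounded into a linear system x = (A ⊗ x) ⊕ b over n ground IDB atoms, and it uses the tropical semiring (min, +) over non-negative weights, which is 0-stable, as the example semiring. The names `naive_eval`, `A`, `b`, and the small graph are hypothetical.

```python
import math

INF = math.inf  # additive identity ("zero") of the tropical semiring (min, +)


def naive_eval(A, b, add, mul, zero):
    """Naive evaluation of the grounded linear system x = (A (x) x) (+) b.

    Starts from the all-zero vector and re-applies the immediate
    consequence operator until nothing changes. Returns the fixpoint and
    the number of applications that changed the vector.
    Assumption: the semiring is stable, so the loop terminates.
    """
    n = len(b)
    x = [zero] * n
    steps = 0
    while True:
        new_x = []
        for i in range(n):
            acc = b[i]
            for j in range(n):
                acc = add(acc, mul(A[i][j], x[j]))
            new_x.append(acc)
        if new_x == x:  # no further change: fixpoint reached
            return x, steps
        x = new_x
        steps += 1


# Hypothetical example: single-source shortest paths from node 0.
# A[v][u] is the weight of edge u -> v (INF if the edge is absent).
A = [
    [INF, INF, INF],
    [1.0, INF, INF],  # edge 0 -> 1 with weight 1
    [5.0, 2.0, INF],  # edges 0 -> 2 (weight 5) and 1 -> 2 (weight 2)
]
b = [0.0, INF, INF]  # the source is at distance 0 from itself

dist, steps = naive_eval(A, b, add=min, mul=lambda u, v: u + v, zero=INF)
print(dist, steps)
```

The iteration counter `steps` is the quantity whose worst case the abstract bounds in terms of p and n; on this toy instance the vector changes three times before stabilizing at the shortest-path distances.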

Subject Classification

ACM Subject Classification
  • Theory of computation → Database query languages (principles)
Keywords
  • Datalog
  • convergence rate
  • semiring


References

  1. Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison-Wesley, 1995. URL: http://webdam.inria.fr/Alice/.
  2. R. C. Backhouse and B. A. Carré. Regular algebra applied to path-finding problems. J. Inst. Math. Appl., 15:161-186, 1975.
  3. Bernard Carré. Graphs and networks. The Clarendon Press, Oxford University Press, New York, 1979. Oxford Applied Mathematics and Computing Science Series.
  4. Patrick Cousot and Radhia Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Robert M. Graham, Michael A. Harrison, and Ravi Sethi, editors, Conference Record of the Fourth ACM Symposium on Principles of Programming Languages, Los Angeles, California, USA, January 1977, pages 238-252. ACM, 1977. URL: https://doi.org/10.1145/512950.512973.
  5. Robert W. Floyd. Algorithm 97: Shortest path. Commun. ACM, 5(6):345, 1962. URL: https://doi.org/10.1145/367766.368168.
  6. M. Gondran. Algèbre linéaire et cheminement dans un graphe. Rev. Française Automat. Informat. Recherche Opérationnelle Sér. Verte, 9(V-1):77-99, 1975.
  7. Michel Gondran and Michel Minoux. Graphs, dioids and semirings, volume 41 of Operations Research/Computer Science Interfaces Series. Springer, New York, 2008. New models and algorithms.
  8. Mark W. Hopkins and Dexter Kozen. Parikh’s theorem in commutative kleene algebra. In 14th Annual IEEE Symposium on Logic in Computer Science, Trento, Italy, July 2-5, 1999, pages 394-401. IEEE Computer Society, 1999. URL: https://doi.org/10.1109/LICS.1999.782634.
  9. John B. Kam and Jeffrey D. Ullman. Global data flow analysis and iterative algorithms. J. ACM, 23(1):158-171, 1976. URL: https://doi.org/10.1145/321921.321938.
  10. Mahmoud Abo Khamis, Hung Q. Ngo, Reinhard Pichler, Dan Suciu, and Yisu Remy Wang. Convergence of datalog over (pre-) semirings. In Leonid Libkin and Pablo Barceló, editors, PODS '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12 - 17, 2022, pages 105-117. ACM, 2022. URL: https://doi.org/10.1145/3517804.3524140.
  11. S. C. Kleene. Representation of events in nerve nets and finite automata. In Automata studies, Annals of mathematics studies, no. 34, pages 3-41. Princeton University Press, Princeton, N. J., 1956.
  12. Werner Kuich. Semirings and formal power series: their relevance to formal languages and automata. In Handbook of formal languages, Vol. 1, pages 609-677. Springer, Berlin, 1997. URL: https://doi.org/10.1007/978-3-642-59136-5_9.
  13. Daniel J. Lehmann. Algebraic structures for transitive closure. Theor. Comput. Sci., 4(1):59-76, 1977. URL: https://doi.org/10.1016/0304-3975(77)90056-1.
  14. Richard J. Lipton, Donald J. Rose, and Robert Endre Tarjan. Generalized nested dissection. SIAM J. Numer. Anal., 16(2):346-358, 1979. URL: https://doi.org/10.1137/0716027.
  15. Richard J. Lipton and Robert Endre Tarjan. Applications of a planar separator theorem. SIAM J. Comput., 9(3):615-627, 1980. URL: https://doi.org/10.1137/0209046.
  16. Flemming Nielson, Hanne Riis Nielson, and Chris Hankin. Principles of program analysis. Springer-Verlag, Berlin, 1999. URL: https://doi.org/10.1007/978-3-662-03811-6.
  17. Günter Rote. Path problems in graphs. In Computational graph theory, volume 7 of Comput. Suppl., pages 155-189. Springer, Vienna, 1990. URL: https://doi.org/10.1007/978-3-7091-9076-0_9.
  18. Robert E. Tarjan. Graph theory and gaussian elimination, 1976. J.R. Bunch and D.J. Rose, eds.
  19. Stephen Warshall. A theorem on boolean matrices. J. ACM, 9(1):11-12, 1962. URL: https://doi.org/10.1145/321105.321107.
  20. Robin J. Wilson. Introduction to Graph Theory. Prentice Hall/Pearson, New York, 2010.
  21. U. Zimmermann. Linear and combinatorial optimization in ordered algebraic structures. Ann. Discrete Math., 10:viii+380, 1981.