Datalog: Bag Semantics via Set Semantics

Authors Leopoldo Bertossi, Georg Gottlob, Reinhard Pichler

Leopoldo Bertossi
  • RelationalAI Inc., USA
  • Carleton University, Ottawa, Canada
  • Member of the "Millennium Institute for Foundational Research on Data" (IMFD, Chile)
Georg Gottlob
  • University of Oxford, UK
  • TU Wien, Austria
Reinhard Pichler
  • TU Wien, Austria


Many thanks to Renzo Angles and Claudio Gutierrez for information on their work on SPARQL with bag semantics, and to Wolfgang Fischl for his help testing some queries in SQL DBMSs. We appreciate the useful comments received from the reviewers. Part of this work was done while L. Bertossi was spending a sabbatical at the DBAI Group of TU Wien with support from the "Vienna Center for Logic and Algorithms" and the Wolfgang Pauli Society. This author is grateful for their support and hospitality, and especially to G. Gottlob for making the stay possible. He was also supported by NSERC Discovery Grant #06148. The work of G. Gottlob was supported by the Austrian Science Fund (FWF):P30930 and the EPSRC programme grant EP/M025268/1 VADA. The work of R. Pichler was supported by the Austrian Science Fund (FWF):P30930.

Leopoldo Bertossi, Georg Gottlob, and Reinhard Pichler. Datalog: Bag Semantics via Set Semantics. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 16:1-16:19, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)


Duplicates in data management are common and problematic. In this work, we present a translation of Datalog under bag semantics into a well-behaved extension of Datalog, the so-called warded Datalog^+/-, under set semantics. From a theoretical point of view, this allows us to reason about bag semantics by making use of the well-established theoretical foundations of set semantics. From a practical point of view, this allows us to handle the bag semantics of Datalog with powerful, existing query engines for the required extension of Datalog. This use of Datalog^+/- is extended to give a set semantics to duplicates in Datalog^+/- itself. We investigate the properties of the resulting Datalog^+/- programs, the problem of deciding multiplicities, and the expressibility of some bag operations. Moreover, the proposed translation has the potential for interesting applications, for example to Multiset Relational Algebra and to the Semantic Web query language SPARQL with bag semantics.
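
To make the idea of "bag semantics via set semantics" concrete, the following Python sketch evaluates a single rule q(x) :- r(x, y), s(y) in two ways: directly over bags, and over sets in which every copy of a tuple carries a distinct identifier. It is only an illustration of the general principle, not the translation defined in the paper. The relations `R` and `S`, the rule, and the label `"rule1"` are hypothetical, and the nested identifier tuple is an assumed stand-in for the fresh value that an existential rule in Datalog^+/- would introduce for each distinct match of the rule body.

```python
from collections import Counter

# Hypothetical bag database: each tuple maps to its multiplicity.
R = Counter({("a", "b"): 2, ("a", "c"): 1})
S = Counter({("b",): 3, ("c",): 1})

def bag_eval(r, s):
    """Bag semantics of q(x) :- r(x, y), s(y): multiply multiplicities per match."""
    q = Counter()
    for (x, y), m in r.items():
        for (y2,), n in s.items():
            if y == y2:
                q[(x,)] += m * n
    return q

def tag(bag, name):
    """Turn a bag into a set by pairing every copy with a distinct identifier."""
    return {(t, (name, t, i)) for t, m in bag.items() for i in range(m)}

def set_eval(r_tagged, s_tagged):
    """Set semantics of q(x; z) :- r(x, y; z1), s(y; z2).

    The identifier z is built from the rule label and the body identifiers,
    an assumed stand-in for a fresh value per distinct body match.
    """
    derived = set()
    for (x, y), z1 in r_tagged:
        for (y2,), z2 in s_tagged:
            if y == y2:
                derived.add(((x,), ("rule1", z1, z2)))
    return derived

def multiplicities(tagged):
    """Recover multiplicities by counting distinct identifiers per tuple."""
    return Counter(t for t, _ in tagged)

bag_result = bag_eval(R, S)
set_result = multiplicities(set_eval(tag(R, "r"), tag(S, "s")))
```

In this sketch both evaluations assign q(a) the same multiplicity, since each pair of tagged body facts yields exactly one distinct identifier. Recursive programs, where derivations can be nested arbitrarily deep, are not covered by this toy example.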

ACM Subject Classification
  • Information systems → Query languages
  • Theory of computation → Logic
  • Theory of computation → Semantics and reasoning

Keywords
  • Datalog
  • duplicates
  • multisets
  • query answering
  • chase
  • Datalog+/-


