Document

**Published in:** LIPIcs, Volume 255, 26th International Conference on Database Theory (ICDT 2023)

We initiate the study of probabilistic query evaluation under bag semantics where tuples are allowed to be present with duplicates. We focus on self-join free conjunctive queries, and probabilistic databases where occurrences of different facts are independent, which is the natural generalization of tuple-independent probabilistic databases to the bag semantics setting. For set semantics, the data complexity of this problem is well understood, even for the more general class of unions of conjunctive queries: it is either in polynomial time, or #P-hard, depending on the query (Dalvi & Suciu, JACM 2012).
Due to potentially unbounded multiplicities, the bag probabilistic databases we discuss are no longer finite objects, which requires a treatment of representation mechanisms. Moreover, the answer to a Boolean query is a probability distribution over non-negative integers, rather than a probability distribution over {true, false}. Therefore, we discuss two flavors of probabilistic query evaluation: computing expectations of answer tuple multiplicities, and computing the probability that a tuple is contained in the answer at most k times for some parameter k. Subject to mild technical assumptions on the representation systems, it turns out that expectations are easy to compute, even for unions of conjunctive queries. For query answer probabilities, we obtain a dichotomy between solvability in polynomial time and #P-hardness for self-join free conjunctive queries.

Martin Grohe, Peter Lindner, and Christoph Standke. Probabilistic Query Evaluation with Bag Semantics. In 26th International Conference on Database Theory (ICDT 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 255, pp. 20:1-20:19, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{grohe_et_al:LIPIcs.ICDT.2023.20, author = {Grohe, Martin and Lindner, Peter and Standke, Christoph}, title = {{Probabilistic Query Evaluation with Bag Semantics}}, booktitle = {26th International Conference on Database Theory (ICDT 2023)}, pages = {20:1--20:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-270-9}, ISSN = {1868-8969}, year = {2023}, volume = {255}, editor = {Geerts, Floris and Vandevoort, Brecht}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2023.20}, URN = {urn:nbn:de:0030-drops-177636}, doi = {10.4230/LIPIcs.ICDT.2023.20}, annote = {Keywords: Probabilistic Query Evaluation, Probabilistic Databases, Bag Semantics} }

Document

**Published in:** LIPIcs, Volume 155, 23rd International Conference on Database Theory (ICDT 2020)

Probabilistic databases (PDBs) are used to model uncertainty in data in a quantitative way. In the standard formal framework, PDBs are finite probability spaces over relational database instances. It has been argued convincingly that this is not compatible with an open-world semantics (Ceylan et al., KR 2016) and with application scenarios that are modeled by continuous probability distributions (Dalvi et al., CACM 2009).
We recently introduced a model of PDBs as infinite probability spaces that addresses these issues (Grohe and Lindner, PODS 2019). While that work was mainly concerned with countably infinite probability spaces, our focus here is on uncountable spaces. Such an extension is necessary to model typical continuous probability distributions that appear in many applications. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics.
It turns out that so-called finite point processes are the appropriate model from probability theory for dealing with probabilistic databases. This model allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and Datalog queries.

Martin Grohe and Peter Lindner. Infinite Probabilistic Databases. In 23rd International Conference on Database Theory (ICDT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 155, pp. 16:1-16:20, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{grohe_et_al:LIPIcs.ICDT.2020.16, author = {Grohe, Martin and Lindner, Peter}, title = {{Infinite Probabilistic Databases}}, booktitle = {23rd International Conference on Database Theory (ICDT 2020)}, pages = {16:1--16:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-139-9}, ISSN = {1868-8969}, year = {2020}, volume = {155}, editor = {Lutz, Carsten and Jung, Jean Christoph}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2020.16}, URN = {urn:nbn:de:0030-drops-119400}, doi = {10.4230/LIPIcs.ICDT.2020.16}, annote = {Keywords: Probabilistic Databases, Possible Worlds Semantics, Query Measurability, Relational Algebra, Aggregate Queries} }

Document

**Published in:** LIPIcs, Volume 119, 27th EACSL Annual Conference on Computer Science Logic (CSL 2018)

Automatic structures are structures that admit a finite presentation via automata. Their most prominent feature is that their theories are decidable. In the literature, one finds automatic structures with non-elementary theory (e.g., the complete binary tree with equal-level predicate) and automatic structures whose theories are at most 3-fold exponential (e.g., Presburger arithmetic or infinite automatic graphs of bounded degree). This observation led Durand-Gasselin to the question whether there are automatic structures of arbitrary high elementary complexity.
We give a positive answer to this question. Namely, we show that for every h >=0 the forest of (infinitely many copies of) all finite trees of height at most h+2 is automatic and it's theory is complete for STA(*, exp_h(n, poly(n)), poly(n)), an alternating complexity class between h-fold exponential time and space. This exact determination of the complexity of the theory of these forests might be of independent interest.

Faried Abu Zaid, Dietrich Kuske, and Peter Lindner. Climbing up the Elementary Complexity Classes with Theories of Automatic Structures. In 27th EACSL Annual Conference on Computer Science Logic (CSL 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 119, pp. 3:1-3:16, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{abuzaid_et_al:LIPIcs.CSL.2018.3, author = {Abu Zaid, Faried and Kuske, Dietrich and Lindner, Peter}, title = {{Climbing up the Elementary Complexity Classes with Theories of Automatic Structures}}, booktitle = {27th EACSL Annual Conference on Computer Science Logic (CSL 2018)}, pages = {3:1--3:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-088-0}, ISSN = {1868-8969}, year = {2018}, volume = {119}, editor = {Ghica, Dan R. and Jung, Achim}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CSL.2018.3}, URN = {urn:nbn:de:0030-drops-96701}, doi = {10.4230/LIPIcs.CSL.2018.3}, annote = {Keywords: Automatic Structures, Complexity Theory, Model Theory} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail