5 Search Results for "Lindner, Peter"


Document
The Importance of Parameters in Database Queries

Authors: Martin Grohe, Benny Kimelfeld, Peter Lindner, and Christoph Standke

Published in: LIPIcs, Volume 290, 27th International Conference on Database Theory (ICDT 2024)


Abstract
We propose and study a framework for quantifying the importance of the choices of parameter values to the result of a query over a database. These parameters occur as constants in logical queries, such as conjunctive queries. In our framework, the importance of a parameter is its SHAP score. This score is a popular instantiation of the game-theoretic Shapley value to measuring the importance of feature values in machine learning models. We make the case for the rationale of using this score by explaining the intuition behind SHAP, and by showing that we arrive at this score in two different, apparently opposing, approaches to quantifying the contribution of a parameter. The application of the SHAP score requires two components in addition to the query and the database: (a) a probability distribution over the combinations of parameter values, and (b) a utility function that measures the similarity between the result for the original parameters and the result for hypothetical parameters. The main question addressed in the paper is the complexity of calculating the SHAP score for different distributions and similarity measures. We first address the case of probabilistically independent parameters. The problem is hard if we consider a fragment of queries that is hard to evaluate (as one would expect), and even for the fragment of acyclic conjunctive queries. In some cases, though, one can efficiently list all relevant parameter combinations, and then the SHAP score can be computed in polynomial time under reasonable general conditions. Also tractable is the case of full acyclic conjunctive queries for certain (natural) similarity functions. We extend our results to conjunctive queries with inequalities between variables and parameters. Finally, we discuss a simple approximation technique for the case of correlated parameters.

Cite as

Martin Grohe, Benny Kimelfeld, Peter Lindner, and Christoph Standke. The Importance of Parameters in Database Queries. In 27th International Conference on Database Theory (ICDT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 290, pp. 14:1-14:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{grohe_et_al:LIPIcs.ICDT.2024.14,
  author =	{Grohe, Martin and Kimelfeld, Benny and Lindner, Peter and Standke, Christoph},
  title =	{{The Importance of Parameters in Database Queries}},
  booktitle =	{27th International Conference on Database Theory (ICDT 2024)},
  pages =	{14:1--14:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-312-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{290},
  editor =	{Cormode, Graham and Shekelyan, Michael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2024.14},
  URN =		{urn:nbn:de:0030-drops-197966},
  doi =		{10.4230/LIPIcs.ICDT.2024.14},
  annote =	{Keywords: SHAP score, query parameters, Shapley value}
}
Document
Probabilistic Query Evaluation with Bag Semantics

Authors: Martin Grohe, Peter Lindner, and Christoph Standke

Published in: LIPIcs, Volume 255, 26th International Conference on Database Theory (ICDT 2023)


Abstract
We initiate the study of probabilistic query evaluation under bag semantics where tuples are allowed to be present with duplicates. We focus on self-join free conjunctive queries, and probabilistic databases where occurrences of different facts are independent, which is the natural generalization of tuple-independent probabilistic databases to the bag semantics setting. For set semantics, the data complexity of this problem is well understood, even for the more general class of unions of conjunctive queries: it is either in polynomial time, or #P-hard, depending on the query (Dalvi & Suciu, JACM 2012). Due to potentially unbounded multiplicities, the bag probabilistic databases we discuss are no longer finite objects, which requires a treatment of representation mechanisms. Moreover, the answer to a Boolean query is a probability distribution over non-negative integers, rather than a probability distribution over {true, false}. Therefore, we discuss two flavors of probabilistic query evaluation: computing expectations of answer tuple multiplicities, and computing the probability that a tuple is contained in the answer at most k times for some parameter k. Subject to mild technical assumptions on the representation systems, it turns out that expectations are easy to compute, even for unions of conjunctive queries. For query answer probabilities, we obtain a dichotomy between solvability in polynomial time and #P-hardness for self-join free conjunctive queries.

Cite as

Martin Grohe, Peter Lindner, and Christoph Standke. Probabilistic Query Evaluation with Bag Semantics. In 26th International Conference on Database Theory (ICDT 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 255, pp. 20:1-20:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{grohe_et_al:LIPIcs.ICDT.2023.20,
  author =	{Grohe, Martin and Lindner, Peter and Standke, Christoph},
  title =	{{Probabilistic Query Evaluation with Bag Semantics}},
  booktitle =	{26th International Conference on Database Theory (ICDT 2023)},
  pages =	{20:1--20:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-270-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{255},
  editor =	{Geerts, Floris and Vandevoort, Brecht},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2023.20},
  URN =		{urn:nbn:de:0030-drops-177636},
  doi =		{10.4230/LIPIcs.ICDT.2023.20},
  annote =	{Keywords: Probabilistic Query Evaluation, Probabilistic Databases, Bag Semantics}
}
Document
Database Repairing with Soft Functional Dependencies

Authors: Nofar Carmeli, Martin Grohe, Benny Kimelfeld, Ester Livshits, and Muhammad Tibi

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)


Abstract
A common interpretation of soft constraints penalizes the database for every violation of every constraint, where the penalty is the cost (weight) of the constraint. A computational challenge is that of finding an optimal subset: a collection of database tuples that minimizes the total penalty when each tuple has a cost of being excluded. When the constraints are strict (i.e., have an infinite cost), this subset is a "cardinality repair" of an inconsistent database; in soft interpretations, this subset corresponds to a "most probable world" of a probabilistic database, a "most likely intention" of a probabilistic unclean database, and so on. Within the class of functional dependencies, the complexity of finding a cardinality repair is thoroughly understood. Yet, very little is known about the complexity of finding an optimal subset for the more general soft semantics. This paper makes a significant progress in this direction. In addition to general insights about the hardness and approximability of the problem, we present algorithms for two special cases: a single functional dependency, and a bipartite matching. The latter is the problem of finding an optimal "almost matching" of a bipartite graph where a penalty is paid for every lost edge and every violation of monogamy.

Cite as

Nofar Carmeli, Martin Grohe, Benny Kimelfeld, Ester Livshits, and Muhammad Tibi. Database Repairing with Soft Functional Dependencies. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 16:1-16:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{carmeli_et_al:LIPIcs.ICDT.2021.16,
  author =	{Carmeli, Nofar and Grohe, Martin and Kimelfeld, Benny and Livshits, Ester and Tibi, Muhammad},
  title =	{{Database Repairing with Soft Functional Dependencies}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{16:1--16:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.16},
  URN =		{urn:nbn:de:0030-drops-137245},
  doi =		{10.4230/LIPIcs.ICDT.2021.16},
  annote =	{Keywords: Database inconsistency, database repairs, integrity constraints, soft constraints, functional dependencies}
}
Document
Infinite Probabilistic Databases

Authors: Martin Grohe and Peter Lindner

Published in: LIPIcs, Volume 155, 23rd International Conference on Database Theory (ICDT 2020)


Abstract
Probabilistic databases (PDBs) are used to model uncertainty in data in a quantitative way. In the standard formal framework, PDBs are finite probability spaces over relational database instances. It has been argued convincingly that this is not compatible with an open-world semantics (Ceylan et al., KR 2016) and with application scenarios that are modeled by continuous probability distributions (Dalvi et al., CACM 2009). We recently introduced a model of PDBs as infinite probability spaces that addresses these issues (Grohe and Lindner, PODS 2019). While that work was mainly concerned with countably infinite probability spaces, our focus here is on uncountable spaces. Such an extension is necessary to model typical continuous probability distributions that appear in many applications. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics. It turns out that so-called finite point processes are the appropriate model from probability theory for dealing with probabilistic databases. This model allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and Datalog queries.

Cite as

Martin Grohe and Peter Lindner. Infinite Probabilistic Databases. In 23rd International Conference on Database Theory (ICDT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 155, pp. 16:1-16:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{grohe_et_al:LIPIcs.ICDT.2020.16,
  author =	{Grohe, Martin and Lindner, Peter},
  title =	{{Infinite Probabilistic Databases}},
  booktitle =	{23rd International Conference on Database Theory (ICDT 2020)},
  pages =	{16:1--16:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-139-9},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{155},
  editor =	{Lutz, Carsten and Jung, Jean Christoph},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2020.16},
  URN =		{urn:nbn:de:0030-drops-119400},
  doi =		{10.4230/LIPIcs.ICDT.2020.16},
  annote =	{Keywords: Probabilistic Databases, Possible Worlds Semantics, Query Measurability, Relational Algebra, Aggregate Queries}
}
Document
Climbing up the Elementary Complexity Classes with Theories of Automatic Structures

Authors: Faried Abu Zaid, Dietrich Kuske, and Peter Lindner

Published in: LIPIcs, Volume 119, 27th EACSL Annual Conference on Computer Science Logic (CSL 2018)


Abstract
Automatic structures are structures that admit a finite presentation via automata. Their most prominent feature is that their theories are decidable. In the literature, one finds automatic structures with non-elementary theory (e.g., the complete binary tree with equal-level predicate) and automatic structures whose theories are at most 3-fold exponential (e.g., Presburger arithmetic or infinite automatic graphs of bounded degree). This observation led Durand-Gasselin to the question whether there are automatic structures of arbitrary high elementary complexity. We give a positive answer to this question. Namely, we show that for every h >=0 the forest of (infinitely many copies of) all finite trees of height at most h+2 is automatic and it's theory is complete for STA(*, exp_h(n, poly(n)), poly(n)), an alternating complexity class between h-fold exponential time and space. This exact determination of the complexity of the theory of these forests might be of independent interest.

Cite as

Faried Abu Zaid, Dietrich Kuske, and Peter Lindner. Climbing up the Elementary Complexity Classes with Theories of Automatic Structures. In 27th EACSL Annual Conference on Computer Science Logic (CSL 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 119, pp. 3:1-3:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{abuzaid_et_al:LIPIcs.CSL.2018.3,
  author =	{Abu Zaid, Faried and Kuske, Dietrich and Lindner, Peter},
  title =	{{Climbing up the Elementary Complexity Classes with Theories of Automatic Structures}},
  booktitle =	{27th EACSL Annual Conference on Computer Science Logic (CSL 2018)},
  pages =	{3:1--3:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-088-0},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{119},
  editor =	{Ghica, Dan R. and Jung, Achim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CSL.2018.3},
  URN =		{urn:nbn:de:0030-drops-96701},
  doi =		{10.4230/LIPIcs.CSL.2018.3},
  annote =	{Keywords: Automatic Structures, Complexity Theory, Model Theory}
}
  • Refine by Author
  • 4 Grohe, Martin
  • 4 Lindner, Peter
  • 2 Kimelfeld, Benny
  • 2 Standke, Christoph
  • 1 Abu Zaid, Faried
  • Show More...

  • Refine by Classification
  • 3 Theory of computation → Incomplete, inconsistent, and uncertain databases
  • 2 Theory of computation → Database query languages (principles)
  • 1 Information systems → Data cleaning
  • 1 Mathematics of computing → Probabilistic representations
  • 1 Theory of computation → Complexity theory and logic

  • Refine by Keyword
  • 2 Probabilistic Databases
  • 1 Aggregate Queries
  • 1 Automatic Structures
  • 1 Bag Semantics
  • 1 Complexity Theory
  • Show More...

  • Refine by Type
  • 5 document

  • Refine by Publication Year
  • 1 2018
  • 1 2020
  • 1 2021
  • 1 2023
  • 1 2024

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail