DROPS

Document

DOI: 10.4230/LIPIcs.ICDT.2024.7

Skyline Operators for Document Spanners

Authors: Antoine Amarilli, Benny Kimelfeld, Sébastien Labbé, and Stefan Mengel

Published in: LIPIcs, Volume 290, 27th International Conference on Database Theory (ICDT 2024)

Abstract

When extracting a relation of spans (intervals) from a text document, a common practice is to filter out tuples of the relation that are deemed dominated by others. The domination rule is defined as a partial order that varies along different systems and tasks. For example, we may state that a tuple is dominated by tuples that extend it by assigning additional attributes, or assigning larger intervals. The result of filtering the relation would then be the skyline according to this partial order. As this filtering may remove most of the extracted tuples, we study whether we can improve the performance of the extraction by compiling the domination rule into the extractor. To this aim, we introduce the skyline operator for declarative information extraction tasks expressed as document spanners. We show that this operator can be expressed via regular operations when the domination partial order can itself be expressed as a regular spanner, which covers several natural domination rules. Yet, we show that the skyline operator incurs a computational cost (under combined complexity). First, there are cases where the operator requires an exponential blowup on the number of states needed to represent the spanner as a sequential variable-set automaton. Second, the evaluation may become computationally hard. Our analysis more precisely identifies classes of domination rules for which the combined complexity is tractable or intractable.

Cite as

Antoine Amarilli, Benny Kimelfeld, Sébastien Labbé, and Stefan Mengel. Skyline Operators for Document Spanners. In 27th International Conference on Database Theory (ICDT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 290, pp. 7:1-7:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2024.7,
  author =	{Amarilli, Antoine and Kimelfeld, Benny and Labb\'{e}, S\'{e}bastien and Mengel, Stefan},
  title =	{{Skyline Operators for Document Spanners}},
  booktitle =	{27th International Conference on Database Theory (ICDT 2024)},
  pages =	{7:1--7:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-312-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{290},
  editor =	{Cormode, Graham and Shekelyan, Michael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2024.7},
  URN =		{urn:nbn:de:0030-drops-197898},
  doi =		{10.4230/LIPIcs.ICDT.2024.7},
  annote =	{Keywords: Information Extraction, Document Spanners, Query Evaluation}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2024.15

Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability

Authors: Antoine Amarilli, Timothy van Bremen, and Kuldeep S. Meel

Published in: LIPIcs, Volume 290, 27th International Conference on Database Theory (ICDT 2024)

Abstract

Query evaluation over probabilistic databases is a notoriously intractable problem - not only in combined complexity, but for many natural queries in data complexity as well [Antoine Amarilli et al., 2017; Nilesh N. Dalvi and Dan Suciu, 2012]. This motivates the study of probabilistic query evaluation through the lens of approximation algorithms, and particularly of combined FPRASes, whose runtime is polynomial in both the query and instance size. In this paper, we focus on tuple-independent probabilistic databases over binary signatures, which can be equivalently viewed as probabilistic graphs. We study in which cases we can devise combined FPRASes for probabilistic query evaluation in this setting. We settle the complexity of this problem for a variety of query and instance classes, by proving both approximability and (conditional) inapproximability results. This allows us to deduce many corollaries of possible independent interest. For example, we show how the results of [Marcelo Arenas et al., 2021] on counting fixed-length strings accepted by an NFA imply the existence of an FPRAS for the two-terminal network reliability problem on directed acyclic graphs: this was an open problem until now [Rico Zenklusen and Marco Laumanns, 2011]. We also show that one cannot extend a recent result [Timothy van Bremen and Kuldeep S. Meel, 2023] that gives a combined FPRAS for self-join-free conjunctive queries of bounded hypertree width on probabilistic databases: neither the bounded-hypertree-width condition nor the self-join-freeness hypothesis can be relaxed. Finally, we complement all our inapproximability results with unconditional lower bounds, showing that DNNF provenance circuits must have at least moderately exponential size in combined complexity.

Cite as

Antoine Amarilli, Timothy van Bremen, and Kuldeep S. Meel. Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability. In 27th International Conference on Database Theory (ICDT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 290, pp. 15:1-15:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2024.15,
  author =	{Amarilli, Antoine and van Bremen, Timothy and Meel, Kuldeep S.},
  title =	{{Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability}},
  booktitle =	{27th International Conference on Database Theory (ICDT 2024)},
  pages =	{15:1--15:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-312-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{290},
  editor =	{Cormode, Graham and Shekelyan, Michael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2024.15},
  URN =		{urn:nbn:de:0030-drops-197978},
  doi =		{10.4230/LIPIcs.ICDT.2024.15},
  annote =	{Keywords: Probabilistic query evaluation, tuple-independent databases, approximation}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2024.25

Ranked Enumeration for MSO on Trees via Knowledge Compilation

Authors: Antoine Amarilli, Pierre Bourhis, Florent Capelli, and Mikaël Monet

Published in: LIPIcs, Volume 290, 27th International Conference on Database Theory (ICDT 2024)

Abstract

We study the problem of enumerating the satisfying assignments for certain circuit classes from knowledge compilation, where assignments are ranked in a specific order. In particular, we show how this problem can be used to efficiently perform ranked enumeration of the answers to MSO queries over trees, with the order being given by a ranking function satisfying a subset-monotonicity property. Assuming that the number of variables is constant, we show that we can enumerate the satisfying assignments in ranked order for so-called multivalued circuits that are smooth, decomposable, and in negation normal form (smooth multivalued DNNF). There is no preprocessing and the enumeration delay is linear in the size of the circuit times the number of values, plus a logarithmic term in the number of assignments produced so far. If we further assume that the circuit is deterministic (smooth multivalued d-DNNF), we can achieve linear-time preprocessing in the circuit, and the delay only features the logarithmic term.

Cite as

Antoine Amarilli, Pierre Bourhis, Florent Capelli, and Mikaël Monet. Ranked Enumeration for MSO on Trees via Knowledge Compilation. In 27th International Conference on Database Theory (ICDT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 290, pp. 25:1-25:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2024.25,
  author =	{Amarilli, Antoine and Bourhis, Pierre and Capelli, Florent and Monet, Mika\"{e}l},
  title =	{{Ranked Enumeration for MSO on Trees via Knowledge Compilation}},
  booktitle =	{27th International Conference on Database Theory (ICDT 2024)},
  pages =	{25:1--25:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-312-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{290},
  editor =	{Cormode, Graham and Shekelyan, Michael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2024.25},
  URN =		{urn:nbn:de:0030-drops-198079},
  doi =		{10.4230/LIPIcs.ICDT.2024.25},
  annote =	{Keywords: Enumeration, knowledge compilation, monadic second-order logic}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2023.14

Uniform Reliability for Unbounded Homomorphism-Closed Graph Queries

Authors: Antoine Amarilli

Published in: LIPIcs, Volume 255, 26th International Conference on Database Theory (ICDT 2023)

Abstract

We study the uniform query reliability problem, which asks, for a fixed Boolean query Q, given an instance I, how many subinstances of I satisfy Q. Equivalently, this is a restricted case of Boolean query evaluation on tuple-independent probabilistic databases where all facts must have probability 1/2. We focus on graph signatures, and on queries closed under homomorphisms. We show that for any such query that is unbounded, i.e., not equivalent to a union of conjunctive queries, the uniform reliability problem is #P-hard. This recaptures the hardness, e.g., of s-t connectedness, which counts how many subgraphs of an input graph have a path between a source and a sink. This new hardness result on uniform reliability strengthens our earlier hardness result on probabilistic query evaluation for unbounded homomorphism-closed queries [Amarilli and Ceylan, 2021]. Indeed, our earlier proof crucially used facts with probability 1, so it did not apply to the unweighted case. The new proof presented in this paper avoids this; it uses our recent hardness result on uniform reliability for non-hierarchical conjunctive queries without self-joins [Antoine Amarilli and Benny Kimelfeld, 2022], along with new techniques.

Cite as

Antoine Amarilli. Uniform Reliability for Unbounded Homomorphism-Closed Graph Queries. In 26th International Conference on Database Theory (ICDT 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 255, pp. 14:1-14:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{amarilli:LIPIcs.ICDT.2023.14,
  author =	{Amarilli, Antoine},
  title =	{{Uniform Reliability for Unbounded Homomorphism-Closed Graph Queries}},
  booktitle =	{26th International Conference on Database Theory (ICDT 2023)},
  pages =	{14:1--14:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-270-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{255},
  editor =	{Geerts, Floris and Vandevoort, Brecht},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2023.14},
  URN =		{urn:nbn:de:0030-drops-177566},
  doi =		{10.4230/LIPIcs.ICDT.2023.14},
  annote =	{Keywords: Uniform reliability, #P-hardness, probabilistic databases}
}

Document

DOI: 10.4230/LIPIcs.STACS.2023.8

Enumerating Regular Languages with Bounded Delay

Authors: Antoine Amarilli and Mikaël Monet

Published in: LIPIcs, Volume 254, 40th International Symposium on Theoretical Aspects of Computer Science (STACS 2023)

Abstract

We study the task, for a given language L, of enumerating the (generally infinite) sequence of its words, without repetitions, while bounding the delay between two consecutive words. To allow for delay bounds that do not depend on the current word length, we assume a model where we produce each word by editing the preceding word with a small edit script, rather than writing out the word from scratch. In particular, this witnesses that the language is orderable, i.e., we can write its words as an infinite sequence such that the Levenshtein edit distance between any two consecutive words is bounded by a value that depends only on the language. For instance, (a+b)^* is orderable (with a variant of the Gray code), but a^* + b^* is not. We characterize which regular languages are enumerable in this sense, and show that this can be decided in PTIME in an input deterministic finite automaton (DFA) for the language. In fact, we show that, given a DFA A, we can compute in PTIME automata A₁, …, A_t such that L(A) is partitioned as L(A₁) ⊔ … ⊔ L(A_t) and every L(A_i) is orderable in this sense. Further, we show that the value of t obtained is optimal, i.e., we cannot partition L(A) into less than t orderable languages. In the case where L(A) is orderable (i.e., t = 1), we show that the ordering can be produced by a bounded-delay algorithm: specifically, the algorithm runs in a suitable pointer machine model, and produces a sequence of bounded-length edit scripts to visit the words of L(A) without repetitions, with bounded delay - exponential in |A| - between each script. In fact, we show that we can achieve this while only allowing the edit operations push and pop at the beginning and end of the word, which implies that the word can in fact be maintained in a double-ended queue. By contrast, when fixing the distance bound d between consecutive words and the number of classes of the partition, it is NP-hard in the input DFA A to decide if L(A) is orderable in this sense, already for finite languages. Last, we study the model where push-pop edits are only allowed at the end of the word, corresponding to a case where the word is maintained on a stack. We show that these operations are strictly weaker and that the slender languages are precisely those that can be partitioned into finitely many languages that are orderable in this sense. For the slender languages, we can again characterize the minimal number of languages in the partition, and achieve bounded-delay enumeration.

Cite as

Antoine Amarilli and Mikaël Monet. Enumerating Regular Languages with Bounded Delay. In 40th International Symposium on Theoretical Aspects of Computer Science (STACS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 254, pp. 8:1-8:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.STACS.2023.8,
  author =	{Amarilli, Antoine and Monet, Mika\"{e}l},
  title =	{{Enumerating Regular Languages with Bounded Delay}},
  booktitle =	{40th International Symposium on Theoretical Aspects of Computer Science (STACS 2023)},
  pages =	{8:1--8:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-266-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{254},
  editor =	{Berenbrink, Petra and Bouyer, Patricia and Dawar, Anuj and Kant\'{e}, Mamadou Moustapha},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2023.8},
  URN =		{urn:nbn:de:0030-drops-176609},
  doi =		{10.4230/LIPIcs.STACS.2023.8},
  annote =	{Keywords: Regular language, constant-delay enumeration, edit distance}
}

Document

DOI: 10.4230/LIPIcs.MFCS.2022.9

Weighted Counting of Matchings in Unbounded-Treewidth Graph Families

Authors: Antoine Amarilli and Mikaël Monet

Published in: LIPIcs, Volume 241, 47th International Symposium on Mathematical Foundations of Computer Science (MFCS 2022)

Abstract

We consider a weighted counting problem on matchings, denoted PrMatching(𝒢), on an arbitrary fixed graph family 𝒢. The input consists of a graph G ∈ 𝒢 and of rational probabilities of existence on every edge of G, assuming independence. The output is the probability of obtaining a matching of G in the resulting distribution, i.e., a set of edges that are pairwise disjoint. It is known that, if 𝒢 has bounded treewidth, then PrMatching(𝒢) can be solved in polynomial time. In this paper we show that, under some assumptions, bounded treewidth in fact characterizes the tractable graph families for this problem. More precisely, we show intractability for all graph families 𝒢 satisfying the following treewidth-constructibility requirement: given an integer k in unary, we can construct in polynomial time a graph G ∈ 𝒢 with treewidth at least k. Our hardness result is then the following: for any treewidth-constructible graph family 𝒢, the problem PrMatching(𝒢) is intractable. This generalizes known hardness results for weighted matching counting under some restrictions that do not bound treewidth, e.g., being planar, 3-regular, or bipartite; it also answers a question left open in [Amarilli et al., 2016]. We also obtain a similar lower bound for the weighted counting of edge covers.

Cite as

Antoine Amarilli and Mikaël Monet. Weighted Counting of Matchings in Unbounded-Treewidth Graph Families. In 47th International Symposium on Mathematical Foundations of Computer Science (MFCS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 241, pp. 9:1-9:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.MFCS.2022.9,
  author =	{Amarilli, Antoine and Monet, Mika\"{e}l},
  title =	{{Weighted Counting of Matchings in Unbounded-Treewidth Graph Families}},
  booktitle =	{47th International Symposium on Mathematical Foundations of Computer Science (MFCS 2022)},
  pages =	{9:1--9:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-256-3},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{241},
  editor =	{Szeider, Stefan and Ganian, Robert and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2022.9},
  URN =		{urn:nbn:de:0030-drops-168078},
  doi =		{10.4230/LIPIcs.MFCS.2022.9},
  annote =	{Keywords: Treewidth, counting complexity, matchings, Fibonacci sequence}
}

Document

Track B: Automata, Logic, Semantics, and Theory of Programming

DOI: 10.4230/LIPIcs.ICALP.2021.116

Dynamic Membership for Regular Languages

Authors: Antoine Amarilli, Louis Jachiet, and Charles Paperman

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)

Abstract

We study the dynamic membership problem for regular languages: fix a language L, read a word w, build in time O(|w|) a data structure indicating if w is in L, and maintain this structure efficiently under letter substitutions on w. We consider this problem on the unit cost RAM model with logarithmic word length, where the problem always has a solution in O(log|w| / log log|w|) per operation. We show that the problem is in O(log log|w|) for languages in an algebraically-defined, decidable class QSG, and that it is in O(1) for another such class QLZG. We show that languages not in QSG admit a reduction from the prefix problem for a cyclic group, so that they require Ω(log|w| /log log|w|) operations in the worst case; and that QSG languages not in QLZG admit a reduction from the prefix problem for the multiplicative monoid U₁ = {0, 1}, which we conjecture cannot be maintained in O(1). This yields a conditional trichotomy. We also investigate intermediate cases between O(1) and O(log log|w|). Our results are shown via the dynamic word problem for monoids and semigroups, for which we also give a classification. We thus solve open problems of the paper of Skovbjerg Frandsen, Miltersen, and Skyum [Skovbjerg Frandsen et al., 1997] on the dynamic word problem, and additionally cover regular languages.

Cite as

Antoine Amarilli, Louis Jachiet, and Charles Paperman. Dynamic Membership for Regular Languages. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 116:1-116:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICALP.2021.116,
  author =	{Amarilli, Antoine and Jachiet, Louis and Paperman, Charles},
  title =	{{Dynamic Membership for Regular Languages}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{116:1--116:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.116},
  URN =		{urn:nbn:de:0030-drops-141850},
  doi =		{10.4230/LIPIcs.ICALP.2021.116},
  annote =	{Keywords: regular language, membership, RAM model, updates, dynamic}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2021.17

Uniform Reliability of Self-Join-Free Conjunctive Queries

Authors: Antoine Amarilli and Benny Kimelfeld

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

The reliability of a Boolean Conjunctive Query (CQ) over a tuple-independent probabilistic database is the probability that the CQ is satisfied when the tuples of the database are sampled one by one, independently, with their associated probability. For queries without self-joins (repeated relation symbols), the data complexity of this problem is fully characterized in a known dichotomy: reliability can be computed in polynomial time for hierarchical queries, and is #P-hard for non-hierarchical queries. Hierarchical queries also characterize the tractability of queries for other tasks: having read-once lineage formulas, supporting insertion/deletion updates to the database in constant time, and having a tractable computation of tuples' Shapley and Banzhaf values. In this work, we investigate a fundamental counting problem for CQs without self-joins: how many sets of facts from the input database satisfy the query? This is equivalent to the uniform case of the query reliability problem, where the probability of every tuple is required to be 1/2. Of course, for hierarchical queries, uniform reliability is in polynomial time, like the reliability problem. However, it is an open question whether being hierarchical is necessary for the uniform reliability problem to be in polynomial time. In fact, the complexity of the problem has been unknown even for the simplest non-hierarchical CQs without self-joins. We solve this open question by showing that uniform reliability is #P-complete for every non-hierarchical CQ without self-joins. Hence, we establish that being hierarchical also characterizes the tractability of unweighted counting of the satisfying tuple subsets. We also consider the generalization to query reliability where all tuples of the same relation have the same probability, and give preliminary results on the complexity of this problem.

Cite as

Antoine Amarilli and Benny Kimelfeld. Uniform Reliability of Self-Join-Free Conjunctive Queries. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 17:1-17:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2021.17,
  author =	{Amarilli, Antoine and Kimelfeld, Benny},
  title =	{{Uniform Reliability of Self-Join-Free Conjunctive Queries}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{17:1--17:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.17},
  URN =		{urn:nbn:de:0030-drops-137252},
  doi =		{10.4230/LIPIcs.ICDT.2021.17},
  annote =	{Keywords: Hierarchical conjunctive queries, query reliability, tuple-independent database, counting problems, #P-hardness}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2020.5

A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs

Authors: Antoine Amarilli and İsmail İlkan Ceylan

Published in: LIPIcs, Volume 155, 23rd International Conference on Database Theory (ICDT 2020)

Abstract

We study the problem of probabilistic query evaluation (PQE) over probabilistic graphs, namely, tuple-independent probabilistic databases (TIDs) on signatures of arity two. Our focus is the class of queries that is closed under homomorphisms, or equivalently, the infinite unions of conjunctive queries, denoted UCQ∞. Our main result states that all unbounded queries in UCQ∞ are #P-hard for PQE. As bounded queries in UCQ∞ are already classified by the dichotomy of Dalvi and Suciu [Dalvi and Suciu, 2012], our results and theirs imply a complete dichotomy on PQE for UCQ∞ queries over probabilistic graphs. This dichotomy covers in particular all fragments in UCQ∞ such as negation-free (disjunctive) Datalog, regular path queries, and a large class of ontology-mediated queries on arity-two signatures. Our result is shown by reducing from counting the valuations of positive partitioned 2-DNF formulae (#PP2DNF) for some queries, or from the source-to-target reliability problem in an undirected graph (#U-ST-CON) for other queries, depending on properties of minimal models.

Cite as

Antoine Amarilli and İsmail İlkan Ceylan. A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs. In 23rd International Conference on Database Theory (ICDT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 155, pp. 5:1-5:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2020.5,
  author =	{Amarilli, Antoine and Ceylan, \.{I}smail \.{I}lkan},
  title =	{{A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs}},
  booktitle =	{23rd International Conference on Database Theory (ICDT 2020)},
  pages =	{5:1--5:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-139-9},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{155},
  editor =	{Lutz, Carsten and Jung, Jean Christoph},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2020.5},
  URN =		{urn:nbn:de:0030-drops-119295},
  doi =		{10.4230/LIPIcs.ICDT.2020.5},
  annote =	{Keywords: Tuple-independent database, #P-hardness, recursive queries, homomorphism-closed queries}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2019.12

An Experimental Study of the Treewidth of Real-World Graph Data

Authors: Silviu Maniu, Pierre Senellart, and Suraj Jog

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)

Abstract

Treewidth is a parameter that measures how tree-like a relational instance is, and whether it can reasonably be decomposed into a tree. Many computation tasks are known to be tractable on databases of small treewidth, but computing the treewidth of a given instance is intractable. This article is the first large-scale experimental study of treewidth and tree decompositions of real-world database instances (25 datasets from 8 different domains, with sizes ranging from a few thousand to a few million vertices). The goal is to determine which data, if any, can benefit of the wealth of algorithms for databases of small treewidth. For each dataset, we obtain upper and lower bound estimations of their treewidth, and study the properties of their tree decompositions. We show in particular that, even when treewidth is high, using partial tree decompositions can result in data structures that can assist algorithms.

Cite as

Silviu Maniu, Pierre Senellart, and Suraj Jog. An Experimental Study of the Treewidth of Real-World Graph Data. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 12:1-12:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{maniu_et_al:LIPIcs.ICDT.2019.12,
  author =	{Maniu, Silviu and Senellart, Pierre and Jog, Suraj},
  title =	{{An Experimental Study of the Treewidth of Real-World Graph Data}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  pages =	{12:1--12:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.12},
  URN =		{urn:nbn:de:0030-drops-103147},
  doi =		{10.4230/LIPIcs.ICDT.2019.12},
  annote =	{Keywords: Treewidth, Graph decompositions, Experiments, Query processing}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2019.18

A Single Approach to Decide Chase Termination on Linear Existential Rules

Authors: Michel Leclère, Marie-Laure Mugnier, Michaël Thomazo, and Federico Ulliana

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)

Abstract

Existential rules, long known as tuple-generating dependencies in database theory, have been intensively studied in the last decade as a powerful formalism to represent ontological knowledge in the context of ontology-based query answering. A knowledge base is then composed of an instance that contains incomplete data and a set of existential rules, and answers to queries are logically entailed from the knowledge base. This brought again to light the fundamental chase tool, and its different variants that have been proposed in the literature. It is well-known that the problem of determining, given a chase variant and a set of existential rules, whether the chase will halt on any instance, is undecidable. Hence, a crucial issue is whether it becomes decidable for known subclasses of existential rules. In this work, we consider linear existential rules with atomic head, a simple yet important subclass of existential rules that generalizes inclusion dependencies. We show the decidability of the all-instance chase termination problem on these rules for three main chase variants, namely semi-oblivious, restricted and core chase. To obtain these results, we introduce a novel approach based on so-called derivation trees and a single notion of forbidden pattern. Besides the theoretical interest of a unified approach and new proofs for the semi-oblivious and core chase variants, we provide the first positive decidability results concerning the termination of the restricted chase, proving that chase termination on linear existential rules with atomic head is decidable for both versions of the problem: Does every chase sequence terminate? Does some chase sequence terminate?

Cite as

Michel Leclère, Marie-Laure Mugnier, Michaël Thomazo, and Federico Ulliana. A Single Approach to Decide Chase Termination on Linear Existential Rules. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 18:1-18:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{leclere_et_al:LIPIcs.ICDT.2019.18,
  author =	{Lecl\`{e}re, Michel and Mugnier, Marie-Laure and Thomazo, Micha\"{e}l and Ulliana, Federico},
  title =	{{A Single Approach to Decide Chase Termination on Linear Existential Rules}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  pages =	{18:1--18:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.18},
  URN =		{urn:nbn:de:0030-drops-103200},
  doi =		{10.4230/LIPIcs.ICDT.2019.18},
  annote =	{Keywords: Chase, Tuple Generating Dependencies, Existential rules, Decidability}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2019.22

Constant-Delay Enumeration for Nondeterministic Document Spanners

Authors: Antoine Amarilli, Pierre Bourhis, Stefan Mengel, and Matthias Niewerth

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)

Abstract

We consider the information extraction framework known as document spanners, and study the problem of efficiently computing the results of the extraction from an input document, where the extraction task is described as a sequential variable-set automaton (VA). We pose this problem in the setting of enumeration algorithms, where we can first run a preprocessing phase and must then produce the results with a small delay between any two consecutive results. Our goal is to have an algorithm which is tractable in combined complexity, i.e., in the sizes of the input document and the VA; while ensuring the best possible data complexity bounds in the input document size, i.e., constant delay in the document size. Several recent works at PODS'18 proposed such algorithms but with linear delay in the document size or with an exponential dependency in size of the (generally nondeterministic) input VA. In particular, Florenzano et al. suggest that our desired runtime guarantees cannot be met for general sequential VAs. We refute this and show that, given a nondeterministic sequential VA and an input document, we can enumerate the mappings of the VA on the document with the following bounds: the preprocessing is linear in the document size and polynomial in the size of the VA, and the delay is independent of the document and polynomial in the size of the VA. The resulting algorithm thus achieves tractability in combined complexity and the best possible data complexity bounds. Moreover, it is rather easy to describe, in particular for the restricted case of so-called extended VAs.

Cite as

Antoine Amarilli, Pierre Bourhis, Stefan Mengel, and Matthias Niewerth. Constant-Delay Enumeration for Nondeterministic Document Spanners. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 22:1-22:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2019.22,
  author =	{Amarilli, Antoine and Bourhis, Pierre and Mengel, Stefan and Niewerth, Matthias},
  title =	{{Constant-Delay Enumeration for Nondeterministic Document Spanners}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  pages =	{22:1--22:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.22},
  URN =		{urn:nbn:de:0030-drops-103246},
  doi =		{10.4230/LIPIcs.ICDT.2019.22},
  annote =	{Keywords: enumeration, spanners, automata}
}

Document

DOI: 10.4230/LIPIcs.ICALP.2018.115

Topological Sorting with Regular Constraints

Authors: Antoine Amarilli and Charles Paperman

Published in: LIPIcs, Volume 107, 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)

Abstract

We introduce the constrained topological sorting problem (CTS): given a regular language K and a directed acyclic graph G with labeled vertices, determine if G has a topological sort that forms a word in K. This natural problem applies to several settings, e.g., scheduling with costs or verifying concurrent programs. We consider the problem CTS[K] where the target language K is fixed, and study its complexity depending on K. We show that CTS[K] is tractable when K falls in several language families, e.g., unions of monomials, which can be used for pattern matching. However, we show that CTS[K] is NP-hard for K = (ab)^* and introduce a shuffle reduction technique to show hardness for more languages. We also study the special case of the constrained shuffle problem (CSh), where the input graph is a disjoint union of strings, and show that CSh[K] is additionally tractable when K is a group language or a union of district group monomials. We conjecture that a dichotomy should hold on the complexity of CTS[K] or CSh[K] depending on K, and substantiate this by proving a coarser dichotomy under a different problem phrasing which ensures that tractable languages are closed under common operators.

Cite as

Antoine Amarilli and Charles Paperman. Topological Sorting with Regular Constraints. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 115:1-115:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICALP.2018.115,
  author =	{Amarilli, Antoine and Paperman, Charles},
  title =	{{Topological Sorting with Regular Constraints}},
  booktitle =	{45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)},
  pages =	{115:1--115:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-076-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{107},
  editor =	{Chatzigiannakis, Ioannis and Kaklamanis, Christos and Marx, D\'{a}niel and Sannella, Donald},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2018.115},
  URN =		{urn:nbn:de:0030-drops-91193},
  doi =		{10.4230/LIPIcs.ICALP.2018.115},
  annote =	{Keywords: Topological sorting, shuffle problem, regular language}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2018.5

Enumeration on Trees under Relabelings

Authors: Antoine Amarilli, Pierre Bourhis, and Stefan Mengel

Published in: LIPIcs, Volume 98, 21st International Conference on Database Theory (ICDT 2018)

Abstract

We study how to evaluate MSO queries with free variables on trees, within the framework of enumeration algorithms. Previous work has shown how to enumerate answers with linear-time preprocessing and delay linear in the size of each output, i.e., constant-delay for free first-order variables. We extend this result to support relabelings, a restricted kind of update operations on trees which allows us to change the node labels. Our main result shows that we can enumerate the answers of MSO queries on trees with linear-time preprocessing and delay linear in each answer, while supporting node relabelings in logarithmic time. To prove this, we reuse the circuit-based enumeration structure from our earlier work, and develop techniques to maintain its index under node relabelings. We also show how enumeration under relabelings can be applied to evaluate practical query languages, such as aggregate, group-by, and parameterized queries.

Cite as

Antoine Amarilli, Pierre Bourhis, and Stefan Mengel. Enumeration on Trees under Relabelings. In 21st International Conference on Database Theory (ICDT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 98, pp. 5:1-5:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2018.5,
  author =	{Amarilli, Antoine and Bourhis, Pierre and Mengel, Stefan},
  title =	{{Enumeration on Trees under Relabelings}},
  booktitle =	{21st International Conference on Database Theory (ICDT 2018)},
  pages =	{5:1--5:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-063-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{98},
  editor =	{Kimelfeld, Benny and Amsterdamer, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2018.5},
  URN =		{urn:nbn:de:0030-drops-86060},
  doi =		{10.4230/LIPIcs.ICDT.2018.5},
  annote =	{Keywords: enumeration, trees, updates, MSO, circuits, knowledge compilation}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2018.6

Connecting Width and Structure in Knowledge Compilation

Authors: Antoine Amarilli, Mikaël Monet, and Pierre Senellart

Published in: LIPIcs, Volume 98, 21st International Conference on Database Theory (ICDT 2018)

Abstract

Several query evaluation tasks can be done via knowledge compilation: the query result is compiled as a lineage circuit from which the answer can be determined. For such tasks, it is important to leverage some width parameters of the circuit, such as bounded treewidth or pathwidth, to convert the circuit to structured classes, e.g., deterministic structured NNFs (d-SDNNFs) or OBDDs. In this work, we show how to connect the width of circuits to the size of their structured representation, through upper and lower bounds. For the upper bound, we show how bounded-treewidth circuits can be converted to a d-SDNNF, in time linear in the circuit size. Our bound, unlike existing results, is constructive and only singly exponential in the treewidth. We show a related lower bound on monotone DNF or CNF formulas, assuming a constant bound on the arity (size of clauses) and degree (number of occurrences of each variable). Specifically, any d-SDNNF (resp., SDNNF) for such a DNF (resp., CNF) must be of exponential size in its treewidth; and the same holds for pathwidth when compiling to OBDDs. Our lower bounds, in contrast with most previous work, apply to any formula of this class, not just a well-chosen family. Hence, for our language of DNF and CNF, pathwidth and treewidth respectively characterize the efficiency of compiling to OBDDs and (d-)SDNNFs, that is, compilation is singly exponential in the width parameter. We conclude by applying our lower bound results to the task of query evaluation.

Cite as

Antoine Amarilli, Mikaël Monet, and Pierre Senellart. Connecting Width and Structure in Knowledge Compilation. In 21st International Conference on Database Theory (ICDT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 98, pp. 6:1-6:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2018.6,
  author =	{Amarilli, Antoine and Monet, Mika\"{e}l and Senellart, Pierre},
  title =	{{Connecting Width and Structure in Knowledge Compilation}},
  booktitle =	{21st International Conference on Database Theory (ICDT 2018)},
  pages =	{6:1--6:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-063-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{98},
  editor =	{Kimelfeld, Benny and Amsterdamer, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2018.6},
  URN =		{urn:nbn:de:0030-drops-86083},
  doi =		{10.4230/LIPIcs.ICDT.2018.6},
  annote =	{Keywords: knowledge compilation, probabilistic databases, treewidth, circuits}
}

19 Search Results for "Amarilli, Antoine"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message