Document

Complete Volume

**Published in:** OASIcs, Volume 119, The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen (2024)

OASIcs, Volume 119, Tannen's Festschrift, Complete Volume

The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen. Open Access Series in Informatics (OASIcs), Volume 119, pp. 1-196, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@Proceedings{amarilli_et_al:OASIcs.Tannen, title = {{OASIcs, Volume 119, Tannen's Festschrift, Complete Volume}}, booktitle = {The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen}, pages = {1--196}, series = {Open Access Series in Informatics (OASIcs)}, ISBN = {978-3-95977-320-1}, ISSN = {2190-6807}, year = {2024}, volume = {119}, editor = {Amarilli, Antoine and Deutsch, Alin}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Tannen}, URN = {urn:nbn:de:0030-drops-200953}, doi = {10.4230/OASIcs.Tannen}, annote = {Keywords: OASIcs, Volume 119, Tannen's Festschrift, Complete Volume} }

Document

Front Matter

**Published in:** OASIcs, Volume 119, The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen (2024)

Front Matter, Table of Contents, Preface, Conference Organization

The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen. Open Access Series in Informatics (OASIcs), Volume 119, pp. 0:i-0:xii, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:OASIcs.Tannen.0, author = {Amarilli, Antoine and Deutsch, Alin}, title = {{Front Matter, Table of Contents, Preface, Conference Organization}}, booktitle = {The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen}, pages = {0:i--0:xii}, series = {Open Access Series in Informatics (OASIcs)}, ISBN = {978-3-95977-320-1}, ISSN = {2190-6807}, year = {2024}, volume = {119}, editor = {Amarilli, Antoine and Deutsch, Alin}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Tannen.0}, URN = {urn:nbn:de:0030-drops-200962}, doi = {10.4230/OASIcs.Tannen.0}, annote = {Keywords: Front Matter, Table of Contents, Preface, Conference Organization} }

Document

**Published in:** LIPIcs, Volume 290, 27th International Conference on Database Theory (ICDT 2024)

When extracting a relation of spans (intervals) from a text document, a common practice is to filter out tuples of the relation that are deemed dominated by others. The domination rule is defined as a partial order that varies along different systems and tasks. For example, we may state that a tuple is dominated by tuples that extend it by assigning additional attributes, or assigning larger intervals. The result of filtering the relation would then be the skyline according to this partial order. As this filtering may remove most of the extracted tuples, we study whether we can improve the performance of the extraction by compiling the domination rule into the extractor.
To this aim, we introduce the skyline operator for declarative information extraction tasks expressed as document spanners. We show that this operator can be expressed via regular operations when the domination partial order can itself be expressed as a regular spanner, which covers several natural domination rules. Yet, we show that the skyline operator incurs a computational cost (under combined complexity). First, there are cases where the operator requires an exponential blowup on the number of states needed to represent the spanner as a sequential variable-set automaton. Second, the evaluation may become computationally hard. Our analysis more precisely identifies classes of domination rules for which the combined complexity is tractable or intractable.

Antoine Amarilli, Benny Kimelfeld, Sébastien Labbé, and Stefan Mengel. Skyline Operators for Document Spanners. In 27th International Conference on Database Theory (ICDT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 290, pp. 7:1-7:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2024.7, author = {Amarilli, Antoine and Kimelfeld, Benny and Labb\'{e}, S\'{e}bastien and Mengel, Stefan}, title = {{Skyline Operators for Document Spanners}}, booktitle = {27th International Conference on Database Theory (ICDT 2024)}, pages = {7:1--7:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-312-6}, ISSN = {1868-8969}, year = {2024}, volume = {290}, editor = {Cormode, Graham and Shekelyan, Michael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2024.7}, URN = {urn:nbn:de:0030-drops-197898}, doi = {10.4230/LIPIcs.ICDT.2024.7}, annote = {Keywords: Information Extraction, Document Spanners, Query Evaluation} }

Document

**Published in:** LIPIcs, Volume 290, 27th International Conference on Database Theory (ICDT 2024)

Query evaluation over probabilistic databases is a notoriously intractable problem - not only in combined complexity, but for many natural queries in data complexity as well [Antoine Amarilli et al., 2017; Nilesh N. Dalvi and Dan Suciu, 2012]. This motivates the study of probabilistic query evaluation through the lens of approximation algorithms, and particularly of combined FPRASes, whose runtime is polynomial in both the query and instance size. In this paper, we focus on tuple-independent probabilistic databases over binary signatures, which can be equivalently viewed as probabilistic graphs. We study in which cases we can devise combined FPRASes for probabilistic query evaluation in this setting.
We settle the complexity of this problem for a variety of query and instance classes, by proving both approximability and (conditional) inapproximability results. This allows us to deduce many corollaries of possible independent interest. For example, we show how the results of [Marcelo Arenas et al., 2021] on counting fixed-length strings accepted by an NFA imply the existence of an FPRAS for the two-terminal network reliability problem on directed acyclic graphs: this was an open problem until now [Rico Zenklusen and Marco Laumanns, 2011]. We also show that one cannot extend a recent result [Timothy van Bremen and Kuldeep S. Meel, 2023] that gives a combined FPRAS for self-join-free conjunctive queries of bounded hypertree width on probabilistic databases: neither the bounded-hypertree-width condition nor the self-join-freeness hypothesis can be relaxed. Finally, we complement all our inapproximability results with unconditional lower bounds, showing that DNNF provenance circuits must have at least moderately exponential size in combined complexity.

Antoine Amarilli, Timothy van Bremen, and Kuldeep S. Meel. Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability. In 27th International Conference on Database Theory (ICDT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 290, pp. 15:1-15:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2024.15, author = {Amarilli, Antoine and van Bremen, Timothy and Meel, Kuldeep S.}, title = {{Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability}}, booktitle = {27th International Conference on Database Theory (ICDT 2024)}, pages = {15:1--15:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-312-6}, ISSN = {1868-8969}, year = {2024}, volume = {290}, editor = {Cormode, Graham and Shekelyan, Michael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2024.15}, URN = {urn:nbn:de:0030-drops-197978}, doi = {10.4230/LIPIcs.ICDT.2024.15}, annote = {Keywords: Probabilistic query evaluation, tuple-independent databases, approximation} }

Document

**Published in:** LIPIcs, Volume 290, 27th International Conference on Database Theory (ICDT 2024)

We study the problem of enumerating the satisfying assignments for certain circuit classes from knowledge compilation, where assignments are ranked in a specific order. In particular, we show how this problem can be used to efficiently perform ranked enumeration of the answers to MSO queries over trees, with the order being given by a ranking function satisfying a subset-monotonicity property.
Assuming that the number of variables is constant, we show that we can enumerate the satisfying assignments in ranked order for so-called multivalued circuits that are smooth, decomposable, and in negation normal form (smooth multivalued DNNF). There is no preprocessing and the enumeration delay is linear in the size of the circuit times the number of values, plus a logarithmic term in the number of assignments produced so far. If we further assume that the circuit is deterministic (smooth multivalued d-DNNF), we can achieve linear-time preprocessing in the circuit, and the delay only features the logarithmic term.

Antoine Amarilli, Pierre Bourhis, Florent Capelli, and Mikaël Monet. Ranked Enumeration for MSO on Trees via Knowledge Compilation. In 27th International Conference on Database Theory (ICDT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 290, pp. 25:1-25:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2024.25, author = {Amarilli, Antoine and Bourhis, Pierre and Capelli, Florent and Monet, Mika\"{e}l}, title = {{Ranked Enumeration for MSO on Trees via Knowledge Compilation}}, booktitle = {27th International Conference on Database Theory (ICDT 2024)}, pages = {25:1--25:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-312-6}, ISSN = {1868-8969}, year = {2024}, volume = {290}, editor = {Cormode, Graham and Shekelyan, Michael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2024.25}, URN = {urn:nbn:de:0030-drops-198079}, doi = {10.4230/LIPIcs.ICDT.2024.25}, annote = {Keywords: Enumeration, knowledge compilation, monadic second-order logic} }

Document

**Published in:** LIPIcs, Volume 255, 26th International Conference on Database Theory (ICDT 2023)

We study the uniform query reliability problem, which asks, for a fixed Boolean query Q, given an instance I, how many subinstances of I satisfy Q. Equivalently, this is a restricted case of Boolean query evaluation on tuple-independent probabilistic databases where all facts must have probability 1/2. We focus on graph signatures, and on queries closed under homomorphisms. We show that for any such query that is unbounded, i.e., not equivalent to a union of conjunctive queries, the uniform reliability problem is #P-hard. This recaptures the hardness, e.g., of s-t connectedness, which counts how many subgraphs of an input graph have a path between a source and a sink.
This new hardness result on uniform reliability strengthens our earlier hardness result on probabilistic query evaluation for unbounded homomorphism-closed queries [Amarilli and Ceylan, 2021]. Indeed, our earlier proof crucially used facts with probability 1, so it did not apply to the unweighted case. The new proof presented in this paper avoids this; it uses our recent hardness result on uniform reliability for non-hierarchical conjunctive queries without self-joins [Antoine Amarilli and Benny Kimelfeld, 2022], along with new techniques.

Antoine Amarilli. Uniform Reliability for Unbounded Homomorphism-Closed Graph Queries. In 26th International Conference on Database Theory (ICDT 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 255, pp. 14:1-14:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{amarilli:LIPIcs.ICDT.2023.14, author = {Amarilli, Antoine}, title = {{Uniform Reliability for Unbounded Homomorphism-Closed Graph Queries}}, booktitle = {26th International Conference on Database Theory (ICDT 2023)}, pages = {14:1--14:17}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-270-9}, ISSN = {1868-8969}, year = {2023}, volume = {255}, editor = {Geerts, Floris and Vandevoort, Brecht}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2023.14}, URN = {urn:nbn:de:0030-drops-177566}, doi = {10.4230/LIPIcs.ICDT.2023.14}, annote = {Keywords: Uniform reliability, #P-hardness, probabilistic databases} }

Document

**Published in:** LIPIcs, Volume 254, 40th International Symposium on Theoretical Aspects of Computer Science (STACS 2023)

We study the task, for a given language L, of enumerating the (generally infinite) sequence of its words, without repetitions, while bounding the delay between two consecutive words. To allow for delay bounds that do not depend on the current word length, we assume a model where we produce each word by editing the preceding word with a small edit script, rather than writing out the word from scratch. In particular, this witnesses that the language is orderable, i.e., we can write its words as an infinite sequence such that the Levenshtein edit distance between any two consecutive words is bounded by a value that depends only on the language. For instance, (a+b)^* is orderable (with a variant of the Gray code), but a^* + b^* is not.
We characterize which regular languages are enumerable in this sense, and show that this can be decided in PTIME in an input deterministic finite automaton (DFA) for the language. In fact, we show that, given a DFA A, we can compute in PTIME automata A₁, …, A_t such that L(A) is partitioned as L(A₁) ⊔ … ⊔ L(A_t) and every L(A_i) is orderable in this sense. Further, we show that the value of t obtained is optimal, i.e., we cannot partition L(A) into less than t orderable languages.
In the case where L(A) is orderable (i.e., t = 1), we show that the ordering can be produced by a bounded-delay algorithm: specifically, the algorithm runs in a suitable pointer machine model, and produces a sequence of bounded-length edit scripts to visit the words of L(A) without repetitions, with bounded delay - exponential in |A| - between each script. In fact, we show that we can achieve this while only allowing the edit operations push and pop at the beginning and end of the word, which implies that the word can in fact be maintained in a double-ended queue.
By contrast, when fixing the distance bound d between consecutive words and the number of classes of the partition, it is NP-hard in the input DFA A to decide if L(A) is orderable in this sense, already for finite languages.
Last, we study the model where push-pop edits are only allowed at the end of the word, corresponding to a case where the word is maintained on a stack. We show that these operations are strictly weaker and that the slender languages are precisely those that can be partitioned into finitely many languages that are orderable in this sense. For the slender languages, we can again characterize the minimal number of languages in the partition, and achieve bounded-delay enumeration.

Antoine Amarilli and Mikaël Monet. Enumerating Regular Languages with Bounded Delay. In 40th International Symposium on Theoretical Aspects of Computer Science (STACS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 254, pp. 8:1-8:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.STACS.2023.8, author = {Amarilli, Antoine and Monet, Mika\"{e}l}, title = {{Enumerating Regular Languages with Bounded Delay}}, booktitle = {40th International Symposium on Theoretical Aspects of Computer Science (STACS 2023)}, pages = {8:1--8:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-266-2}, ISSN = {1868-8969}, year = {2023}, volume = {254}, editor = {Berenbrink, Petra and Bouyer, Patricia and Dawar, Anuj and Kant\'{e}, Mamadou Moustapha}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2023.8}, URN = {urn:nbn:de:0030-drops-176609}, doi = {10.4230/LIPIcs.STACS.2023.8}, annote = {Keywords: Regular language, constant-delay enumeration, edit distance} }

Document

**Published in:** LIPIcs, Volume 241, 47th International Symposium on Mathematical Foundations of Computer Science (MFCS 2022)

We consider a weighted counting problem on matchings, denoted PrMatching(𝒢), on an arbitrary fixed graph family 𝒢. The input consists of a graph G ∈ 𝒢 and of rational probabilities of existence on every edge of G, assuming independence. The output is the probability of obtaining a matching of G in the resulting distribution, i.e., a set of edges that are pairwise disjoint. It is known that, if 𝒢 has bounded treewidth, then PrMatching(𝒢) can be solved in polynomial time. In this paper we show that, under some assumptions, bounded treewidth in fact characterizes the tractable graph families for this problem. More precisely, we show intractability for all graph families 𝒢 satisfying the following treewidth-constructibility requirement: given an integer k in unary, we can construct in polynomial time a graph G ∈ 𝒢 with treewidth at least k. Our hardness result is then the following: for any treewidth-constructible graph family 𝒢, the problem PrMatching(𝒢) is intractable. This generalizes known hardness results for weighted matching counting under some restrictions that do not bound treewidth, e.g., being planar, 3-regular, or bipartite; it also answers a question left open in [Amarilli et al., 2016]. We also obtain a similar lower bound for the weighted counting of edge covers.

Antoine Amarilli and Mikaël Monet. Weighted Counting of Matchings in Unbounded-Treewidth Graph Families. In 47th International Symposium on Mathematical Foundations of Computer Science (MFCS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 241, pp. 9:1-9:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.MFCS.2022.9, author = {Amarilli, Antoine and Monet, Mika\"{e}l}, title = {{Weighted Counting of Matchings in Unbounded-Treewidth Graph Families}}, booktitle = {47th International Symposium on Mathematical Foundations of Computer Science (MFCS 2022)}, pages = {9:1--9:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-256-3}, ISSN = {1868-8969}, year = {2022}, volume = {241}, editor = {Szeider, Stefan and Ganian, Robert and Silva, Alexandra}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2022.9}, URN = {urn:nbn:de:0030-drops-168078}, doi = {10.4230/LIPIcs.MFCS.2022.9}, annote = {Keywords: Treewidth, counting complexity, matchings, Fibonacci sequence} }

Document

Track B: Automata, Logic, Semantics, and Theory of Programming

**Published in:** LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)

We study the dynamic membership problem for regular languages: fix a language L, read a word w, build in time O(|w|) a data structure indicating if w is in L, and maintain this structure efficiently under letter substitutions on w. We consider this problem on the unit cost RAM model with logarithmic word length, where the problem always has a solution in O(log|w| / log log|w|) per operation.
We show that the problem is in O(log log|w|) for languages in an algebraically-defined, decidable class QSG, and that it is in O(1) for another such class QLZG. We show that languages not in QSG admit a reduction from the prefix problem for a cyclic group, so that they require Ω(log|w| /log log|w|) operations in the worst case; and that QSG languages not in QLZG admit a reduction from the prefix problem for the multiplicative monoid U₁ = {0, 1}, which we conjecture cannot be maintained in O(1). This yields a conditional trichotomy. We also investigate intermediate cases between O(1) and O(log log|w|).
Our results are shown via the dynamic word problem for monoids and semigroups, for which we also give a classification. We thus solve open problems of the paper of Skovbjerg Frandsen, Miltersen, and Skyum [Skovbjerg Frandsen et al., 1997] on the dynamic word problem, and additionally cover regular languages.

Antoine Amarilli, Louis Jachiet, and Charles Paperman. Dynamic Membership for Regular Languages. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 116:1-116:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICALP.2021.116, author = {Amarilli, Antoine and Jachiet, Louis and Paperman, Charles}, title = {{Dynamic Membership for Regular Languages}}, booktitle = {48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)}, pages = {116:1--116:17}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-195-5}, ISSN = {1868-8969}, year = {2021}, volume = {198}, editor = {Bansal, Nikhil and Merelli, Emanuela and Worrell, James}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.116}, URN = {urn:nbn:de:0030-drops-141850}, doi = {10.4230/LIPIcs.ICALP.2021.116}, annote = {Keywords: regular language, membership, RAM model, updates, dynamic} }

Document

**Published in:** LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

The reliability of a Boolean Conjunctive Query (CQ) over a tuple-independent probabilistic database is the probability that the CQ is satisfied when the tuples of the database are sampled one by one, independently, with their associated probability. For queries without self-joins (repeated relation symbols), the data complexity of this problem is fully characterized in a known dichotomy: reliability can be computed in polynomial time for hierarchical queries, and is #P-hard for non-hierarchical queries. Hierarchical queries also characterize the tractability of queries for other tasks: having read-once lineage formulas, supporting insertion/deletion updates to the database in constant time, and having a tractable computation of tuples' Shapley and Banzhaf values.
In this work, we investigate a fundamental counting problem for CQs without self-joins: how many sets of facts from the input database satisfy the query? This is equivalent to the uniform case of the query reliability problem, where the probability of every tuple is required to be 1/2. Of course, for hierarchical queries, uniform reliability is in polynomial time, like the reliability problem. However, it is an open question whether being hierarchical is necessary for the uniform reliability problem to be in polynomial time. In fact, the complexity of the problem has been unknown even for the simplest non-hierarchical CQs without self-joins.
We solve this open question by showing that uniform reliability is #P-complete for every non-hierarchical CQ without self-joins. Hence, we establish that being hierarchical also characterizes the tractability of unweighted counting of the satisfying tuple subsets. We also consider the generalization to query reliability where all tuples of the same relation have the same probability, and give preliminary results on the complexity of this problem.

Antoine Amarilli and Benny Kimelfeld. Uniform Reliability of Self-Join-Free Conjunctive Queries. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 17:1-17:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2021.17, author = {Amarilli, Antoine and Kimelfeld, Benny}, title = {{Uniform Reliability of Self-Join-Free Conjunctive Queries}}, booktitle = {24th International Conference on Database Theory (ICDT 2021)}, pages = {17:1--17:17}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-179-5}, ISSN = {1868-8969}, year = {2021}, volume = {186}, editor = {Yi, Ke and Wei, Zhewei}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.17}, URN = {urn:nbn:de:0030-drops-137252}, doi = {10.4230/LIPIcs.ICDT.2021.17}, annote = {Keywords: Hierarchical conjunctive queries, query reliability, tuple-independent database, counting problems, #P-hardness} }

Document

**Published in:** LIPIcs, Volume 155, 23rd International Conference on Database Theory (ICDT 2020)

We study the problem of probabilistic query evaluation (PQE) over probabilistic graphs, namely, tuple-independent probabilistic databases (TIDs) on signatures of arity two. Our focus is the class of queries that is closed under homomorphisms, or equivalently, the infinite unions of conjunctive queries, denoted UCQ∞. Our main result states that all unbounded queries in UCQ∞ are #P-hard for PQE. As bounded queries in UCQ∞ are already classified by the dichotomy of Dalvi and Suciu [Dalvi and Suciu, 2012], our results and theirs imply a complete dichotomy on PQE for UCQ∞ queries over probabilistic graphs. This dichotomy covers in particular all fragments in UCQ∞ such as negation-free (disjunctive) Datalog, regular path queries, and a large class of ontology-mediated queries on arity-two signatures. Our result is shown by reducing from counting the valuations of positive partitioned 2-DNF formulae (#PP2DNF) for some queries, or from the source-to-target reliability problem in an undirected graph (#U-ST-CON) for other queries, depending on properties of minimal models.

Antoine Amarilli and İsmail İlkan Ceylan. A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs. In 23rd International Conference on Database Theory (ICDT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 155, pp. 5:1-5:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2020.5, author = {Amarilli, Antoine and Ceylan, \.{I}smail \.{I}lkan}, title = {{A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs}}, booktitle = {23rd International Conference on Database Theory (ICDT 2020)}, pages = {5:1--5:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-139-9}, ISSN = {1868-8969}, year = {2020}, volume = {155}, editor = {Lutz, Carsten and Jung, Jean Christoph}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2020.5}, URN = {urn:nbn:de:0030-drops-119295}, doi = {10.4230/LIPIcs.ICDT.2020.5}, annote = {Keywords: Tuple-independent database, #P-hardness, recursive queries, homomorphism-closed queries} }

Document

**Published in:** LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)

We consider the information extraction framework known as document spanners, and study the problem of efficiently computing the results of the extraction from an input document, where the extraction task is described as a sequential variable-set automaton (VA). We pose this problem in the setting of enumeration algorithms, where we can first run a preprocessing phase and must then produce the results with a small delay between any two consecutive results. Our goal is to have an algorithm which is tractable in combined complexity, i.e., in the sizes of the input document and the VA; while ensuring the best possible data complexity bounds in the input document size, i.e., constant delay in the document size. Several recent works at PODS'18 proposed such algorithms but with linear delay in the document size or with an exponential dependency in size of the (generally nondeterministic) input VA. In particular, Florenzano et al. suggest that our desired runtime guarantees cannot be met for general sequential VAs. We refute this and show that, given a nondeterministic sequential VA and an input document, we can enumerate the mappings of the VA on the document with the following bounds: the preprocessing is linear in the document size and polynomial in the size of the VA, and the delay is independent of the document and polynomial in the size of the VA. The resulting algorithm thus achieves tractability in combined complexity and the best possible data complexity bounds. Moreover, it is rather easy to describe, in particular for the restricted case of so-called extended VAs.

Antoine Amarilli, Pierre Bourhis, Stefan Mengel, and Matthias Niewerth. Constant-Delay Enumeration for Nondeterministic Document Spanners. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 22:1-22:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2019.22, author = {Amarilli, Antoine and Bourhis, Pierre and Mengel, Stefan and Niewerth, Matthias}, title = {{Constant-Delay Enumeration for Nondeterministic Document Spanners}}, booktitle = {22nd International Conference on Database Theory (ICDT 2019)}, pages = {22:1--22:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-101-6}, ISSN = {1868-8969}, year = {2019}, volume = {127}, editor = {Barcelo, Pablo and Calautti, Marco}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.22}, URN = {urn:nbn:de:0030-drops-103246}, doi = {10.4230/LIPIcs.ICDT.2019.22}, annote = {Keywords: enumeration, spanners, automata} }

Document

**Published in:** LIPIcs, Volume 107, 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)

We introduce the constrained topological sorting problem (CTS): given a regular language K and a directed acyclic graph G with labeled vertices, determine if G has a topological sort that forms a word in K. This natural problem applies to several settings, e.g., scheduling with costs or verifying concurrent programs. We consider the problem CTS[K] where the target language K is fixed, and study its complexity depending on K. We show that CTS[K] is tractable when K falls in several language families, e.g., unions of monomials, which can be used for pattern matching. However, we show that CTS[K] is NP-hard for K = (ab)^* and introduce a shuffle reduction technique to show hardness for more languages. We also study the special case of the constrained shuffle problem (CSh), where the input graph is a disjoint union of strings, and show that CSh[K] is additionally tractable when K is a group language or a union of district group monomials. We conjecture that a dichotomy should hold on the complexity of CTS[K] or CSh[K] depending on K, and substantiate this by proving a coarser dichotomy under a different problem phrasing which ensures that tractable languages are closed under common operators.

Antoine Amarilli and Charles Paperman. Topological Sorting with Regular Constraints. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 115:1-115:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICALP.2018.115, author = {Amarilli, Antoine and Paperman, Charles}, title = {{Topological Sorting with Regular Constraints}}, booktitle = {45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)}, pages = {115:1--115:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-076-7}, ISSN = {1868-8969}, year = {2018}, volume = {107}, editor = {Chatzigiannakis, Ioannis and Kaklamanis, Christos and Marx, D\'{a}niel and Sannella, Donald}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2018.115}, URN = {urn:nbn:de:0030-drops-91193}, doi = {10.4230/LIPIcs.ICALP.2018.115}, annote = {Keywords: Topological sorting, shuffle problem, regular language} }

Document

**Published in:** LIPIcs, Volume 98, 21st International Conference on Database Theory (ICDT 2018)

We study how to evaluate MSO queries with free variables on trees, within the
framework of enumeration algorithms. Previous work has shown how to enumerate
answers with linear-time preprocessing and delay linear in the size of each
output, i.e., constant-delay for free first-order variables. We extend this
result to support relabelings, a restricted kind of update operations on
trees which allows us to change the node labels. Our main result shows that we
can enumerate the answers of MSO queries on trees with linear-time preprocessing
and delay linear in each answer, while supporting node relabelings in logarithmic time. To
prove this, we reuse the circuit-based enumeration structure from our earlier
work, and develop techniques to maintain its index under node relabelings. We
also show how enumeration under relabelings can be applied to evaluate practical
query languages, such as aggregate, group-by, and parameterized queries.

Antoine Amarilli, Pierre Bourhis, and Stefan Mengel. Enumeration on Trees under Relabelings. In 21st International Conference on Database Theory (ICDT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 98, pp. 5:1-5:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2018.5, author = {Amarilli, Antoine and Bourhis, Pierre and Mengel, Stefan}, title = {{Enumeration on Trees under Relabelings}}, booktitle = {21st International Conference on Database Theory (ICDT 2018)}, pages = {5:1--5:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-063-7}, ISSN = {1868-8969}, year = {2018}, volume = {98}, editor = {Kimelfeld, Benny and Amsterdamer, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2018.5}, URN = {urn:nbn:de:0030-drops-86060}, doi = {10.4230/LIPIcs.ICDT.2018.5}, annote = {Keywords: enumeration, trees, updates, MSO, circuits, knowledge compilation} }

Document

**Published in:** LIPIcs, Volume 98, 21st International Conference on Database Theory (ICDT 2018)

Several query evaluation tasks can be done via knowledge compilation: the query result is compiled as a lineage circuit from which the answer can be determined. For such tasks, it is important to leverage some width parameters of the circuit, such as bounded treewidth or pathwidth, to convert the circuit to structured classes, e.g., deterministic structured NNFs (d-SDNNFs) or OBDDs. In this work, we show how to connect the width of circuits to the size of their structured representation, through upper and lower bounds. For the upper bound, we show how bounded-treewidth circuits can be converted to a d-SDNNF, in time linear in the circuit size. Our bound, unlike existing results, is constructive and only singly exponential in the treewidth. We show a related lower bound on monotone DNF or CNF formulas, assuming a constant bound on the arity (size of clauses) and degree (number of occurrences of each variable). Specifically, any d-SDNNF (resp., SDNNF) for such a DNF (resp., CNF) must be of exponential size in its treewidth; and the same holds for pathwidth when compiling to OBDDs. Our lower bounds, in contrast with most previous work, apply to any formula of this class, not just a well-chosen family. Hence, for our language of DNF and CNF, pathwidth and treewidth respectively characterize the efficiency of compiling to OBDDs and (d-)SDNNFs, that is, compilation is singly exponential in the width parameter. We conclude by applying our lower bound results to the task of query evaluation.

Antoine Amarilli, Mikaël Monet, and Pierre Senellart. Connecting Width and Structure in Knowledge Compilation. In 21st International Conference on Database Theory (ICDT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 98, pp. 6:1-6:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2018.6, author = {Amarilli, Antoine and Monet, Mika\"{e}l and Senellart, Pierre}, title = {{Connecting Width and Structure in Knowledge Compilation}}, booktitle = {21st International Conference on Database Theory (ICDT 2018)}, pages = {6:1--6:17}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-063-7}, ISSN = {1868-8969}, year = {2018}, volume = {98}, editor = {Kimelfeld, Benny and Amsterdamer, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2018.6}, URN = {urn:nbn:de:0030-drops-86083}, doi = {10.4230/LIPIcs.ICDT.2018.6}, annote = {Keywords: knowledge compilation, probabilistic databases, treewidth, circuits} }

Document

**Published in:** LIPIcs, Volume 90, 24th International Symposium on Temporal Representation and Reasoning (TIME 2017)

To combine and query ordered data from multiple sources, one needs to handle uncertainty about the possible orderings. Examples of such "order-incomplete" data include integrated event sequences such as log entries; lists of properties (e.g., hotels and restaurants) ranked by an unknown function reflecting relevance or customer ratings; and documents edited concurrently with an uncertain order on edits. This paper introduces a query language for order-incomplete data, based on the positive relational algebra with order-aware accumulation. We use partial orders to represent order-incomplete data, and study possible and certain answers for queries in this context. We show that these problems are respectively NP-complete and coNP-complete, but identify many tractable cases depending on the query operators or input partial orders.

Antoine Amarilli, Mouhamadou Lamine Ba, Daniel Deutch, and Pierre Senellart. Possible and Certain Answers for Queries over Order-Incomplete Data. In 24th International Symposium on Temporal Representation and Reasoning (TIME 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 90, pp. 4:1-4:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.TIME.2017.4, author = {Amarilli, Antoine and Ba, Mouhamadou Lamine and Deutch, Daniel and Senellart, Pierre}, title = {{Possible and Certain Answers for Queries over Order-Incomplete Data}}, booktitle = {24th International Symposium on Temporal Representation and Reasoning (TIME 2017)}, pages = {4:1--4:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-052-1}, ISSN = {1868-8969}, year = {2017}, volume = {90}, editor = {Schewe, Sven and Schneider, Thomas and Wijsen, Jef}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.TIME.2017.4}, URN = {urn:nbn:de:0030-drops-79311}, doi = {10.4230/LIPIcs.TIME.2017.4}, annote = {Keywords: certain answer, possible answer, partial order, uncertain data} }

Document

**Published in:** LIPIcs, Volume 80, 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)

We study the problem of enumerating the satisfying valuations of a circuit while bounding the delay, i.e., the time needed to compute each successive valuation. We focus on the class of structured d-DNNF circuits originally introduced in knowledge compilation, a sub-area of artificial intelligence. We propose an algorithm for these circuits that enumerates valuations with linear preprocessing and delay linear in the Hamming weight of each valuation. Moreover, valuations of constant Hamming weight can be enumerated with linear preprocessing and constant delay.
Our results yield a framework for efficient enumeration that applies to all problems whose solutions can be compiled to structured d-DNNFs. In particular, we use it to recapture classical results in database theory, for factorized database representations and for MSO evaluation. This gives an independent proof of constant-delay enumeration for MSO formulae with first-order free variables on bounded-treewidth structures.

Antoine Amarilli, Pierre Bourhis, Louis Jachiet, and Stefan Mengel. A Circuit-Based Approach to Efficient Enumeration. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 80, pp. 111:1-111:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICALP.2017.111, author = {Amarilli, Antoine and Bourhis, Pierre and Jachiet, Louis and Mengel, Stefan}, title = {{A Circuit-Based Approach to Efficient Enumeration}}, booktitle = {44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)}, pages = {111:1--111:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-041-5}, ISSN = {1868-8969}, year = {2017}, volume = {80}, editor = {Chatzigiannakis, Ioannis and Indyk, Piotr and Kuhn, Fabian and Muscholl, Anca}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2017.111}, URN = {urn:nbn:de:0030-drops-74626}, doi = {10.4230/LIPIcs.ICALP.2017.111}, annote = {Keywords: circuits, constant-delay, enumeration, d-DNNFs, MSO} }

Document

**Published in:** LIPIcs, Volume 68, 20th International Conference on Database Theory (ICDT 2017)

Many practical scenarios make it necessary to evaluate top-k queries over data items with partially unknown values. This paper considers a setting where the values are taken from a numerical domain, and where some partial order constraints are given over known and unknown values: under these constraints, we assume that all possible worlds are equally likely.
Our work is the first to propose a principled scheme to derive the value distributions and expected values of unknown items in this setting, with the goal of computing estimated top-k results by interpolating the unknown values from the known ones. We study the complexity of this general task, and show tight complexity bounds, proving that the problem is intractable, but
can be tractably approximated. We then consider the case of tree-shaped partial orders, where we show a constructive PTIME solution. We also compare our problem setting to other top-k definitions on uncertain data.

Antoine Amarilli, Yael Amsterdamer, Tova Milo, and Pierre Senellart. Top-k Querying of Unknown Values under Order Constraints. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 5:1-5:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2017.5, author = {Amarilli, Antoine and Amsterdamer, Yael and Milo, Tova and Senellart, Pierre}, title = {{Top-k Querying of Unknown Values under Order Constraints}}, booktitle = {20th International Conference on Database Theory (ICDT 2017)}, pages = {5:1--5:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-024-8}, ISSN = {1868-8969}, year = {2017}, volume = {68}, editor = {Benedikt, Michael and Orsi, Giorgio}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.5}, URN = {urn:nbn:de:0030-drops-70457}, doi = {10.4230/LIPIcs.ICDT.2017.5}, annote = {Keywords: uncertainty, partial order, unknown values, crowdsourcing, interpolation} }

Document

**Published in:** LIPIcs, Volume 68, 20th International Conference on Database Theory (ICDT 2017)

We investigate parameterizations of both database instances and queries that make query evaluation fixed-parameter tractable in combined complexity. We introduce a new Datalog fragment with stratified negation, intensional-clique-guarded Datalog (ICG-Datalog), with linear-time evaluation on structures of bounded treewidth for programs of bounded rule size. Such programs capture in particular conjunctive queries with simplicial decompositions of bounded width, guarded negation fragment queries of bounded CQ-rank, or two-way regular path queries. Our result is shown by compiling to alternating two-way automata, whose semantics is defined via cyclic provenance circuits (cycluits) that can be tractably evaluated. Last, we prove that probabilistic query evaluation remains intractable in combined complexity under this parameterization.

Antoine Amarilli, Pierre Bourhis, Mikaël Monet, and Pierre Senellart. Combined Tractability of Query Evaluation via Tree Automata and Cycluits. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 6:1-6:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2017.6, author = {Amarilli, Antoine and Bourhis, Pierre and Monet, Mika\"{e}l and Senellart, Pierre}, title = {{Combined Tractability of Query Evaluation via Tree Automata and Cycluits}}, booktitle = {20th International Conference on Database Theory (ICDT 2017)}, pages = {6:1--6:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-024-8}, ISSN = {1868-8969}, year = {2017}, volume = {68}, editor = {Benedikt, Michael and Orsi, Giorgio}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.6}, URN = {urn:nbn:de:0030-drops-70516}, doi = {10.4230/LIPIcs.ICDT.2017.6}, annote = {Keywords: query evaluation, tree automata, provenance, treewidth, circuits} }