LIPIcs, Volume 68

20th International Conference on Database Theory (ICDT 2017)




Event

ICDT 2017, March 21-24, 2017, Venice, Italy

Editors

Michael Benedikt
Giorgio Orsi

Publication Details

  • published at: 2017-03-17
  • Publisher: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
  • ISBN: 978-3-95977-024-8
  • DBLP: db/conf/icdt/icdt2017

Documents
Document
Complete Volume
LIPIcs, Volume 68, ICDT'17, Complete Volume

Authors: Michael Benedikt and Giorgio Orsi


Abstract
LIPIcs, Volume 68, ICDT'17, Complete Volume

Cite as

20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@Proceedings{benedikt_et_al:LIPIcs.ICDT.2017,
  title =	{{LIPIcs, Volume 68, ICDT'17, Complete Volume}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017},
  URN =		{urn:nbn:de:0030-drops-70689},
  doi =		{10.4230/LIPIcs.ICDT.2017},
  annote =	{Keywords: Database Management, Normal Forms, Schema and Subschema, Query Languages, \lbrack Systems\rbrack Query Processing, Relational Databases, Distributed Databases}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization, List of Authors

Authors: Michael Benedikt and Giorgio Orsi


Abstract
Front Matter, Table of Contents, Preface, Conference Organization, List of Authors

Cite as

20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 0:i-0:xii, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{benedikt_et_al:LIPIcs.ICDT.2017.0,
  author =	{Benedikt, Michael and Orsi, Giorgio},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization, List of Authors}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{0:i--0:xii},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.0},
  URN =		{urn:nbn:de:0030-drops-70446},
  doi =		{10.4230/LIPIcs.ICDT.2017.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization, List of Authors}
}
Document
Invited Talk
Rewritability in Monadic Disjunctive Datalog, MMSNP, and Expressive Description Logics (Invited Talk)

Authors: Cristina Feier, Antti Kuusisto, and Carsten Lutz


Abstract
We study rewritability of monadic disjunctive Datalog programs, (the complements of) MMSNP sentences, and ontology-mediated queries (OMQs) based on expressive description logics of the ALC family and on conjunctive queries. We show that rewritability into FO and into monadic Datalog (MDLog) are decidable, and that rewritability into Datalog is decidable when the original query satisfies a certain condition related to equality. We establish 2NExpTime-completeness for all studied problems except rewritability into MDLog, for which there remains a gap between 2NExpTime and 3ExpTime. We also analyze the shape of rewritings, which in the MMSNP case correspond to obstructions, and give a new construction of canonical Datalog programs that is more elementary than existing ones and also applies to non-Boolean queries.

Cite as

Cristina Feier, Antti Kuusisto, and Carsten Lutz. Rewritability in Monadic Disjunctive Datalog, MMSNP, and Expressive Description Logics (Invited Talk). In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 1:1-1:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{feier_et_al:LIPIcs.ICDT.2017.1,
  author =	{Feier, Cristina and Kuusisto, Antti and Lutz, Carsten},
  title =	{{Rewritability in Monadic Disjunctive Datalog, MMSNP, and Expressive Description Logics}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{1:1--1:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.1},
  URN =		{urn:nbn:de:0030-drops-70636},
  doi =		{10.4230/LIPIcs.ICDT.2017.1},
  annote =	{Keywords: FO-Rewritability, MDDLog, MMSNP, DL, ontology mediated queries}
}
Document
Invited Talk
Graphs, Hypergraphs, and the Complexity of Conjunctive Database Queries (Invited Talk)

Authors: Dániel Marx


Abstract
The complexity of evaluating conjunctive queries can depend significantly on the structure of the query. For example, it is well known that various notions of acyclicity can make the evaluation problem tractable. More generally, it seems that the complexity is connected to the "treelikeness" of the graph or hypergraph describing the query structure. In the lecture, we will review some of the notions of treelikeness that were proposed in the literature and how they are relevant for the complexity of evaluating conjunctive queries and related problems.

Cite as

Dániel Marx. Graphs, Hypergraphs, and the Complexity of Conjunctive Database Queries (Invited Talk). In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, p. 2:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{marx:LIPIcs.ICDT.2017.2,
  author =	{Marx, D\'{a}niel},
  title =	{{Graphs, Hypergraphs, and the Complexity of Conjunctive Database Queries}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{2:1--2:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.2},
  URN =		{urn:nbn:de:0030-drops-70652},
  doi =		{10.4230/LIPIcs.ICDT.2017.2},
  annote =	{Keywords: Conjunctive queries, treewidth, complexity}
}
Document
Invited Talk
The Smart Crowd - Learning from the Ones Who Know (Invited Talk)

Authors: Tova Milo


Abstract
One of the foremost challenges for information technology over the last few years has been to explore, understand, and extract useful information from large amounts of data. Some particular tasks such as annotating data or matching entities have been outsourced to human workers for many years. But the last few years have seen the rise of a new research field called crowdsourcing that aims at delegating a wide range of tasks to human workers, building formal frameworks, and improving the efficiency of these processes. In order to provide sound scientific foundations for crowdsourcing and support the development of efficient crowdsourcing processes, adequate formal models and algorithms must be defined. In particular, the models must formalize unique characteristics of crowd-based settings, such as the knowledge of the crowd and crowd-provided data; the interaction with crowd members; the inherent inaccuracies and disagreements in crowd answers; and evaluation metrics that capture the cost and effort of the crowd. Clearly, what may be achieved with the help of the crowd depends heavily on the properties and knowledge of the given crowd. In this talk we will focus on knowledgeable crowds. We will examine the use of such crowds, and in particular domain experts, for assisting in solving data management problems. Specifically we will consider three dimensions of the problem: (1) How domain experts can help in improving the data itself, e.g. by gathering missing data and improving the quality of existing data, (2) How they can assist in gathering meta-data that facilitate improved data processing, and (3) How we can find and identify the most relevant crowd for a given data management task.

Cite as

Tova Milo. The Smart Crowd - Learning from the Ones Who Know (Invited Talk). In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, p. 3:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{milo:LIPIcs.ICDT.2017.3,
  author =	{Milo, Tova},
  title =	{{The Smart Crowd - Learning from the Ones Who Know}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{3:1--3:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.3},
  URN =		{urn:nbn:de:0030-drops-70643},
  doi =		{10.4230/LIPIcs.ICDT.2017.3},
  annote =	{Keywords: data management, crowdsourcing}
}
Document
GYM: A Multiround Distributed Join Algorithm

Authors: Foto N. Afrati, Manas R. Joglekar, Christopher M. Re, Semih Salihoglu, and Jeffrey D. Ullman


Abstract
Multiround algorithms are now commonly used in distributed data processing systems, yet the extent to which algorithms can benefit from running more rounds is not well understood. This paper answers this question for the problem of computing the equijoin of n relations. Given any query Q with width w, intersection width iw, input size IN, output size OUT, and a cluster of machines with M = \Omega(IN^{1/\epsilon}) memory available per machine, where \epsilon > 1 and w \ge 1 are constants, we show that: 1. Q can be computed in O(n) rounds with O(n(IN^w + OUT)^2/M) communication cost with high probability. 2. Q can be computed in O(log(n)) rounds with O(n(IN^{max(w, 3iw)} + OUT)^2/M) communication cost with high probability. Intersection width is a new notion we introduce for queries and generalized hypertree decompositions (GHDs) of queries that captures how connected the adjacent components of the GHDs are. We achieve our first result by introducing a distributed and generalized version of Yannakakis's algorithm, called GYM. GYM takes as input any GHD of Q with width w and depth d, and computes Q in O(d + log(n)) rounds with O(n(IN^w + OUT)^2/M) communication cost. We achieve our second result by showing how to construct GHDs of Q with width max(w, 3iw) and depth O(log(n)). We describe another technique to construct GHDs with larger widths and lower depths, demonstrating other tradeoffs one can make between communication and the number of rounds.
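
As a quick sanity check on the tradeoff between rounds and communication, the two cost bounds above can be instantiated numerically. The sketch below uses invented values and ignores the constants hidden by the O-notation; it is purely illustrative:

from math import log2

# Hypothetical parameters: n relations, input/output sizes, per-machine memory M,
# width w and intersection width iw (all values invented).
n, IN, OUT, M = 16, 10**6, 10**7, 10**9
w, iw = 2, 1

cost_n_rounds   = n * (IN**w + OUT) ** 2 / M                 # bound for O(n) rounds
cost_log_rounds = n * (IN**max(w, 3 * iw) + OUT) ** 2 / M    # bound for O(log n) rounds
print(f"O(n) rounds: {cost_n_rounds:.2e}; "
      f"~{log2(n):.0f} rounds: {cost_log_rounds:.2e}")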

Cite as

Foto N. Afrati, Manas R. Joglekar, Christopher M. Re, Semih Salihoglu, and Jeffrey D. Ullman. GYM: A Multiround Distributed Join Algorithm. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 4:1-4:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{afrati_et_al:LIPIcs.ICDT.2017.4,
  author =	{Afrati, Foto N. and Joglekar, Manas R. and Re, Christopher M. and Salihoglu, Semih and Ullman, Jeffrey D.},
  title =	{{GYM: A Multiround Distributed Join Algorithm}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{4:1--4:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.4},
  URN =		{urn:nbn:de:0030-drops-70462},
  doi =		{10.4230/LIPIcs.ICDT.2017.4},
  annote =	{Keywords: Joins, Yannakakis, Bulk Synchronous Processing, GHDs}
}
Document
Top-k Querying of Unknown Values under Order Constraints

Authors: Antoine Amarilli, Yael Amsterdamer, Tova Milo, and Pierre Senellart


Abstract
Many practical scenarios make it necessary to evaluate top-k queries over data items with partially unknown values. This paper considers a setting where the values are taken from a numerical domain, and where some partial order constraints are given over known and unknown values: under these constraints, we assume that all possible worlds are equally likely. Our work is the first to propose a principled scheme to derive the value distributions and expected values of unknown items in this setting, with the goal of computing estimated top-k results by interpolating the unknown values from the known ones. We study the complexity of this general task, and show tight complexity bounds, proving that the problem is intractable, but can be tractably approximated. We then consider the case of tree-shaped partial orders, where we show a constructive PTIME solution. We also compare our problem setting to other top-k definitions on uncertain data.
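
For intuition, take the smallest interesting instance: two unknown values x and y constrained by 0 < x < y < 1, with all possible worlds equally likely. Their interpolated expected values are exactly 1/3 and 2/3, which the following rejection-sampling sketch (my own illustration, not the paper's algorithm) confirms:

import random

samples, sx, sy, kept = 200_000, 0.0, 0.0, 0
for _ in range(samples):
    x, y = random.random(), random.random()
    if x < y:                        # keep only worlds satisfying the order constraints
        sx, sy, kept = sx + x, sy + y, kept + 1
print(sx / kept, sy / kept)          # approximately 1/3 and 2/3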

Cite as

Antoine Amarilli, Yael Amsterdamer, Tova Milo, and Pierre Senellart. Top-k Querying of Unknown Values under Order Constraints. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 5:1-5:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2017.5,
  author =	{Amarilli, Antoine and Amsterdamer, Yael and Milo, Tova and Senellart, Pierre},
  title =	{{Top-k Querying of Unknown Values under Order Constraints}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{5:1--5:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.5},
  URN =		{urn:nbn:de:0030-drops-70457},
  doi =		{10.4230/LIPIcs.ICDT.2017.5},
  annote =	{Keywords: uncertainty, partial order, unknown values, crowdsourcing, interpolation}
}
Document
Combined Tractability of Query Evaluation via Tree Automata and Cycluits

Authors: Antoine Amarilli, Pierre Bourhis, Mikaël Monet, and Pierre Senellart


Abstract
We investigate parameterizations of both database instances and queries that make query evaluation fixed-parameter tractable in combined complexity. We introduce a new Datalog fragment with stratified negation, intensional-clique-guarded Datalog (ICG-Datalog), with linear-time evaluation on structures of bounded treewidth for programs of bounded rule size. Such programs capture in particular conjunctive queries with simplicial decompositions of bounded width, guarded negation fragment queries of bounded CQ-rank, or two-way regular path queries. Our result is shown by compiling to alternating two-way automata, whose semantics is defined via cyclic provenance circuits (cycluits) that can be tractably evaluated. Last, we prove that probabilistic query evaluation remains intractable in combined complexity under this parameterization.

Cite as

Antoine Amarilli, Pierre Bourhis, Mikaël Monet, and Pierre Senellart. Combined Tractability of Query Evaluation via Tree Automata and Cycluits. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 6:1-6:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2017.6,
  author =	{Amarilli, Antoine and Bourhis, Pierre and Monet, Mika\"{e}l and Senellart, Pierre},
  title =	{{Combined Tractability of Query Evaluation via Tree Automata and Cycluits}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{6:1--6:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.6},
  URN =		{urn:nbn:de:0030-drops-70516},
  doi =		{10.4230/LIPIcs.ICDT.2017.6},
  annote =	{Keywords: query evaluation, tree automata, provenance, treewidth, circuits}
}
Document
The Complexity of Reverse Engineering Problems for Conjunctive Queries

Authors: Pablo Barceló and Miguel Romero


Abstract
Reverse engineering problems for conjunctive queries (CQs), such as query by example (QBE) or definability, take a set of user examples and convert them into an explanatory CQ. Despite their importance, the complexity of these problems is prohibitively high (coNEXPTIME-complete). We isolate their two main sources of complexity and propose relaxations of them that reduce the complexity while having meaningful theoretical interpretations. The first relaxation is based on the idea of using existential pebble games for approximating homomorphism tests. We show that this characterizes QBE/definability for CQs up to treewidth k, while reducing the complexity to EXPTIME. As a side result, we obtain that the complexity of the QBE/definability problems for CQs of treewidth k is EXPTIME-complete for each k > 1. The second relaxation is based on the idea of "desynchronizing" direct products, which characterizes QBE/definability for unions of CQs and reduces the complexity to coNP. The combination of these two relaxations yields tractability for QBE and characterizes it in terms of unions of CQs of treewidth at most k. We also study the complexity of these problems for conjunctive regular path queries over graph databases, showing them to be no more difficult than for CQs.

Cite as

Pablo Barceló and Miguel Romero. The Complexity of Reverse Engineering Problems for Conjunctive Queries. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 7:1-7:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2017.7,
  author =	{Barcel\'{o}, Pablo and Romero, Miguel},
  title =	{{The Complexity of Reverse Engineering Problems for Conjunctive Queries}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{7:1--7:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.7},
  URN =		{urn:nbn:de:0030-drops-70525},
  doi =		{10.4230/LIPIcs.ICDT.2017.7},
  annote =	{Keywords: reverse engineering, conjunctive queries, query by example, definability, treewidth, complexity of pebble games}
}
Document
Answering FO+MOD Queries Under Updates on Bounded Degree Databases

Authors: Christoph Berkholz, Jens Keppeler, and Nicole Schweikardt


Abstract
We investigate the query evaluation problem for fixed queries over fully dynamic databases, where tuples can be inserted or deleted. The task is to design a dynamic algorithm that immediately reports the new result of a fixed query after every database update. We consider queries in first-order logic (FO) and its extension with modulo-counting quantifiers (FO+MOD), and show that they can be efficiently evaluated under updates, provided that the dynamic database does not exceed a certain degree bound. In particular, we construct a data structure that allows us to answer a Boolean FO+MOD query and to compute the size of the query result within constant time after every database update. Furthermore, after every update we are able to immediately enumerate the new query result with constant delay between the output tuples. The time needed to build the data structure is linear in the size of the database. Our results extend earlier work on the evaluation of first-order queries on static databases of bounded degree and rely on an effective Hanf normal form for FO+MOD recently obtained by [Heimberg, Kuske, and Schweikardt, LICS, 2016].

Cite as

Christoph Berkholz, Jens Keppeler, and Nicole Schweikardt. Answering FO+MOD Queries Under Updates on Bounded Degree Databases. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 8:1-8:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{berkholz_et_al:LIPIcs.ICDT.2017.8,
  author =	{Berkholz, Christoph and Keppeler, Jens and Schweikardt, Nicole},
  title =	{{Answering FO+MOD Queries Under Updates on Bounded Degree Databases}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{8:1--8:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.8},
  URN =		{urn:nbn:de:0030-drops-70535},
  doi =		{10.4230/LIPIcs.ICDT.2017.8},
  annote =	{Keywords: dynamic databases, query enumeration, counting problem, first-order logic with modulo-counting quantifiers, Hanf locality}
}
Document
How Many Variables Are Needed to Express an Existential Positive Query?

Authors: Simone Bova and Hubie Chen


Abstract
The number of variables used by a first-order query is a fundamental measure which has been studied in numerous contexts, and which is known to be highly relevant to the task of query evaluation. In this article, we study this measure in the context of existential positive queries. Building on previous work, we present a combinatorial quantity defined on existential positive queries; we show that this quantity not only characterizes the minimum number of variables needed to express a given existential positive query by another existential positive query, but also that it characterizes the minimum number of variables needed to express a given existential positive query, over all first-order queries. Put differently and loosely, we show that for any existential positive query, no variables can ever be saved by moving out of existential positive logic to first-order logic. One component of this theorem's proof is the construction of a winning strategy for a certain Ehrenfeucht-Fraïssé type game.

Cite as

Simone Bova and Hubie Chen. How Many Variables Are Needed to Express an Existential Positive Query?. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 9:1-9:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{bova_et_al:LIPIcs.ICDT.2017.9,
  author =	{Bova, Simone and Chen, Hubie},
  title =	{{How Many Variables Are Needed to Express an Existential Positive Query?}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{9:1--9:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.9},
  URN =		{urn:nbn:de:0030-drops-70545},
  doi =		{10.4230/LIPIcs.ICDT.2017.9},
  annote =	{Keywords: existential positive queries, finite-variable logics, first-order logic, query optimization}
}
Document
Expressive Power of Entity-Linking Frameworks

Authors: Douglas Burdick, Ronald Fagin, Phokion G. Kolaitis, Lucian Popa, and Wang-Chiew Tan


Abstract
We develop a unifying approach to declarative entity linking by introducing the notion of an entity linking framework and an accompanying notion of the certain links in such a framework. In an entity linking framework, logic-based constraints are used to express properties of the desired link relations in terms of source relations and, possibly, in terms of other link relations. The definition of the certain links in such a framework makes use of weighted repairs and consistent answers in inconsistent databases. We demonstrate the modeling capabilities of this approach by showing that numerous concrete entity linking scenarios can be cast as such entity linking frameworks for suitable choices of constraints and weights. By using the certain links as a measure of expressive power, we investigate the relative expressive power of several entity linking frameworks and obtain sharp comparisons. In particular, we show that we gain expressive power if we allow constraints that capture non-recursive collective entity resolution, where link relations may depend on other link relations (and not just on source relations). Moreover, we show that an increase in expressive power also takes place when we allow constraints that incorporate preferences as an additional mechanism for expressing "goodness" of links.

Cite as

Douglas Burdick, Ronald Fagin, Phokion G. Kolaitis, Lucian Popa, and Wang-Chiew Tan. Expressive Power of Entity-Linking Frameworks. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 10:1-10:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{burdick_et_al:LIPIcs.ICDT.2017.10,
  author =	{Burdick, Douglas and Fagin, Ronald and Kolaitis, Phokion G. and Popa, Lucian and Tan, Wang-Chiew},
  title =	{{Expressive Power of Entity-Linking Frameworks}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{10:1--10:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.10},
  URN =		{urn:nbn:de:0030-drops-70554},
  doi =		{10.4230/LIPIcs.ICDT.2017.10},
  annote =	{Keywords: entity linking, entity resolution, constraints, repairs, certain links}
}
Document
k-Regret Minimizing Set: Efficient Algorithms and Hardness

Authors: Wei Cao, Jian Li, Haitao Wang, Kangning Wang, Ruosong Wang, Raymond Chi-Wing Wong, and Wei Zhan


Abstract
We study the k-regret minimizing query (k-RMS), which is a useful operator for supporting multi-criteria decision-making. Given two integers k and r, a k-RMS returns r tuples from the database which minimize the k-regret ratio, defined as one minus the worst ratio between the k-th maximum utility score among all tuples in the database and the maximum utility score of the r tuples returned. A solution set contains only r tuples, enjoying the benefits of both top-k queries and skyline queries. Proposed in 2012, the query has been studied extensively in recent years. In this paper, we advance the theory and the practice of k-RMS in the following aspects. First, we develop efficient algorithms for k-RMS (and its decision version) when the dimensionality is 2. The running times of our algorithms improve on those of previous ones. Second, we show that k-RMS is NP-hard even when the dimensionality is 3. This provides a complete characterization of the complexity of k-RMS, and answers an open question in previous studies. In addition, we present approximation algorithms for the problem when the dimensionality is 3 or larger.
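
To make the objective concrete, the k-regret ratio of a candidate set can be estimated by sampling nonnegative linear utility functions. This is only an illustrative sketch of the definition (the paper's algorithms are exact and far more efficient):

import random

def k_regret_ratio(D, S, k, trials=10_000):
    worst = 0.0
    for _ in range(trials):
        u = [random.random() for _ in D[0]]                  # random linear utility
        score = lambda t: sum(w * x for w, x in zip(u, t))
        kth_best = sorted(map(score, D), reverse=True)[k - 1]
        worst = max(worst, 1 - max(map(score, S)) / kth_best)
    return worst

D = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7), (0.5, 0.4)]
print(k_regret_ratio(D, S=[(0.7, 0.7)], k=1))                # regret of one returned tuple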

Cite as

Wei Cao, Jian Li, Haitao Wang, Kangning Wang, Ruosong Wang, Raymond Chi-Wing Wong, and Wei Zhan. k-Regret Minimizing Set: Efficient Algorithms and Hardness. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 11:1-11:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{cao_et_al:LIPIcs.ICDT.2017.11,
  author =	{Cao, Wei and Li, Jian and Wang, Haitao and Wang, Kangning and Wang, Ruosong and Chi-Wing Wong, Raymond and Zhan, Wei},
  title =	{{k-Regret Minimizing Set: Efficient Algorithms and Hardness}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{11:1--11:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.11},
  URN =		{urn:nbn:de:0030-drops-70569},
  doi =		{10.4230/LIPIcs.ICDT.2017.11},
  annote =	{Keywords: multi-criteria decision-making, regret minimizing set, top-k query}
}
Document
The Design of Arbitrage-Free Data Pricing Schemes

Authors: Shaleen Deep and Paraschos Koutris


Abstract
Motivated by a growing market that involves buying and selling data over the web, we study pricing schemes that assign value to queries issued over a database. Previous work studied pricing mechanisms that compute the price of a query by extending a data seller’s explicit prices on certain queries, or investigated the properties that a pricing function should exhibit without detailing a generic construction. In this work, we present a formal framework for pricing queries over data that allows the construction of general families of pricing functions, with the main goal of avoiding arbitrage. We consider two types of pricing schemes: instance-independent schemes, where the price depends only on the structure of the query, and answer-dependent schemes, where the price also depends on the query output. Our main result is a complete characterization of the structure of pricing functions in both settings, by relating it to properties of a function over a lattice. We use our characterization, together with information-theoretic methods, to construct a variety of arbitrage-free pricing functions. Finally, we discuss various tradeoffs in the design space and present techniques for efficient computation of the proposed pricing functions.
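
The most basic arbitrage condition the framework rules out is easy to state: if query Q1 determines query Q2 (Q2's answer can be computed from Q1's), then a sound pricing must have price(Q2) <= price(Q1). Below is a toy check of mine, with a hypothetical determinacy pair and a deliberately bad pricing:

determines = {("SELECT * FROM R", "SELECT COUNT(*) FROM R")}   # Q1 determines Q2
price = {"SELECT * FROM R": 10, "SELECT COUNT(*) FROM R": 25}  # violates the condition

for q1, q2 in determines:
    if price[q2] > price[q1]:          # buyer purchases Q1 and derives Q2's answer
        print(f"arbitrage: buy {q1!r} at {price[q1]}, derive {q2!r} "
              f"instead of paying {price[q2]}")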

Cite as

Shaleen Deep and Paraschos Koutris. The Design of Arbitrage-Free Data Pricing Schemes. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 12:1-12:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{deep_et_al:LIPIcs.ICDT.2017.12,
  author =	{Deep, Shaleen and Koutris, Paraschos},
  title =	{{The Design of Arbitrage-Free Data Pricing Schemes}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{12:1--12:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.12},
  URN =		{urn:nbn:de:0030-drops-70574},
  doi =		{10.4230/LIPIcs.ICDT.2017.12},
  annote =	{Keywords: data pricing, determinacy, arbitrage}
}
Document
A Logic for Document Spanners

Authors: Dominik D. Freydenberger


Abstract
Document spanners are a formal framework for information extraction that was introduced by [Fagin, Kimelfeld, Reiss, and Vansummeren, J.ACM, 2015]. One of the central models in this framework is core spanners, which are based on regular expressions with variables that are then extended with an algebra. As shown by [Freydenberger and Holldack, ICDT, 2016], there is a connection between core spanners and EC^{reg}, the existential theory of concatenation with regular constraints. The present paper further develops this connection by defining SpLog, a fragment of EC^{reg} that has the same expressive power as core spanners. This equivalence extends beyond equivalence of expressive power, as we show the existence of polynomial time conversions between this fragment and core spanners. This even holds for variants of core spanners that are based on automata instead of regular expressions. Applications of this approach include an alternative way of defining relations for spanners, insights into the relative succinctness of various classes of spanner representations, and a pumping lemma for core spanners.
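
As a loose analogy (mine, not the paper's formalism), the primitive building block of core spanners can be pictured with Python's named groups, where variables are assigned spans (offset intervals into the document) rather than strings; the algebra that core spanners add on top of such extractions is not shown here:

import re

doc = "alice@example.org wrote to bob@example.org"
rgx = re.compile(r"(?P<user>\w+)@(?P<host>[\w.]+)")
for m in rgx.finditer(doc):
    print({v: m.span(v) for v in ("user", "host")})   # spans into doc, e.g. (0, 5)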

Cite as

Dominik D. Freydenberger. A Logic for Document Spanners. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 13:1-13:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{freydenberger:LIPIcs.ICDT.2017.13,
  author =	{Freydenberger, Dominik D.},
  title =	{{A Logic for Document Spanners}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{13:1--13:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.13},
  URN =		{urn:nbn:de:0030-drops-70493},
  doi =		{10.4230/LIPIcs.ICDT.2017.13},
  annote =	{Keywords: information extraction, document spanners, word equations, regex, descriptional complexity}
}
Document
Distributed Query Monitoring through Convex Analysis: Towards Composable Safe Zones

Authors: Minos Garofalakis and Vasilis Samoladas


Abstract
Continuous tracking of complex data analytics queries over high-speed distributed streams is becoming increasingly important. Query tracking can be reduced to continuous monitoring of a condition over the global stream. Communication-efficient monitoring relies on locally processing stream data at the sites where it is generated, by deriving site-local conditions which collectively guarantee the global condition. Recently proposed geometric techniques offer a generic approach for splitting an arbitrary global condition into local geometric monitoring constraints (known as "Safe Zones"); still, their application to various problem domains has so far been based on heuristics, lacking a principled, compositional methodology. In this paper, we present the first known formal results on the difficult problem of effective Safe Zone (SZ) design for complex query monitoring over distributed streams. Exploiting tools from convex analysis, our approach relies on an algebraic representation of SZs which allows us to: (1) Formally define the notion of a "good" SZ for distributed monitoring problems; and, most importantly, (2) Tackle and solve the important problem of systematically composing SZs for monitored conditions expressed as Boolean formulas over simpler conditions (for which SZs are known); furthermore, we prove that, under broad assumptions, the composed SZ is good if the component SZs are good. Our results are, therefore, a first step towards a principled compositional solution to SZ design for distributed query monitoring. Finally, we discuss a number of important applications for our SZ design algorithms, also demonstrating how earlier geometric techniques can be seen as special cases of our framework.
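
The basic Safe Zone mechanism is visible in a trivial case (a toy of mine; the paper's contribution is the principled composition of such zones for Boolean combinations of conditions). To monitor the global condition mean(x_1, ..., x_k) <= T, each site i can be given the convex local zone x_i <= T; as long as every site stays inside its zone, the global condition is guaranteed without any communication:

T = 10.0
sites = [3.2, 9.7, 5.1]                         # current local stream values
locally_safe = all(x <= T for x in sites)       # each site checks only its own value
globally_true = sum(sites) / len(sites) <= T    # the condition actually being tracked
print(locally_safe, globally_true)              # local safety implies global truth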

Cite as

Minos Garofalakis and Vasilis Samoladas. Distributed Query Monitoring through Convex Analysis: Towards Composable Safe Zones. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 14:1-14:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{garofalakis_et_al:LIPIcs.ICDT.2017.14,
  author =	{Garofalakis, Minos and Samoladas, Vasilis},
  title =	{{Distributed Query Monitoring through Convex Analysis: Towards Composable Safe Zones}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{14:1--14:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.14},
  URN =		{urn:nbn:de:0030-drops-70665},
  doi =		{10.4230/LIPIcs.ICDT.2017.14},
  annote =	{Keywords: distributed data streams, geometric method}
}
Document
Entropy Bounds for Conjunctive Queries with Functional Dependencies

Authors: Tomasz Gogacz and Szymon Torunczyk


Abstract
We study the problem of finding the worst-case size of the result Q(D) of a fixed conjunctive query Q applied to a database D satisfying given functional dependencies. We provide a characterization of this bound in terms of entropy vectors, and in terms of finite groups. In particular, we show that an upper bound provided by [Gottlob, Lee, Valiant and Valiant, J.ACM, 2012] is tight, and that a correspondence of [Chan and Yeung, ACM TOIT, 2002] is preserved in the presence of functional dependencies. However, tightness of a weaker upper bound provided by Gottlob et al., which would have immediate applications to evaluation of join queries ([Khamis, Ngo, and Suciu, PODS, 2016]) remains open. Our result shows that the problem of computing the worst-case size bound, in the general case, is closely related to difficult problems from information theory.
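
A standard worked example of such a worst-case bound, in the absence of functional dependencies, is the triangle query R(a,b), S(b,c), T(a,c): if each relation has IN tuples, the result has at most IN^{3/2} tuples, and a complete instance over a domain of size sqrt(IN) attains it (illustration mine):

from itertools import product

d = range(2)                       # domain of size 2, so each relation has IN = 4 tuples
R = S = T = set(product(d, d))
out = [(a, b, c) for (a, b) in R
       for (b2, c) in S if b2 == b
       for (a2, c2) in T if a2 == a and c2 == c]
print(len(R), len(out))            # 4 and 8 = 4**1.5: the bound is tight here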

Cite as

Tomasz Gogacz and Szymon Torunczyk. Entropy Bounds for Conjunctive Queries with Functional Dependencies. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 15:1-15:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{gogacz_et_al:LIPIcs.ICDT.2017.15,
  author =	{Gogacz, Tomasz and Torunczyk, Szymon},
  title =	{{Entropy Bounds for Conjunctive Queries with Functional Dependencies}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{15:1--15:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.15},
  URN =		{urn:nbn:de:0030-drops-70479},
  doi =		{10.4230/LIPIcs.ICDT.2017.15},
  annote =	{Keywords: database theory, conjunctive queries, size bounds, entropy, finite groups, entropy cone}
}
Document
On the Automated Verification of Web Applications with Embedded SQL

Authors: Shachar Itzhaky, Tomer Kotek, Noam Rinetzky, Mooly Sagiv, Orr Tamir, Helmut Veith, and Florian Zuleger


Abstract
A large number of web applications are based on a relational database together with a program, typically a script, that enables the user to interact with the database through embedded SQL queries and commands. In this paper, we introduce a method for formal automated verification of such systems which connects database theory to mainstream program analysis. We identify a fragment of SQL which captures the behavior of the queries in our case studies, is algorithmically decidable, and facilitates the construction of weakest preconditions. Thus, we can integrate the analysis of SQL queries into a program analysis tool chain. To this end, we implement a new decision procedure for the SQL fragment that we introduce. We demonstrate practical applicability of our results with three case studies, a web administrator, a simple firewall, and a conference management system.

Cite as

Shachar Itzhaky, Tomer Kotek, Noam Rinetzky, Mooly Sagiv, Orr Tamir, Helmut Veith, and Florian Zuleger. On the Automated Verification of Web Applications with Embedded SQL. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 16:1-16:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{itzhaky_et_al:LIPIcs.ICDT.2017.16,
  author =	{Itzhaky, Shachar and Kotek, Tomer and Rinetzky, Noam and Sagiv, Mooly and Tamir, Orr and Veith, Helmut and Zuleger, Florian},
  title =	{{On the Automated Verification of Web Applications with Embedded SQL}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{16:1--16:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.16},
  URN =		{urn:nbn:de:0030-drops-70509},
  doi =		{10.4230/LIPIcs.ICDT.2017.16},
  annote =	{Keywords: SQL, scripting language, web services, program verification, two-variable fragment of first order logic, decidability, reasoning}
}
Document
Detecting Ambiguity in Prioritized Database Repairing

Authors: Benny Kimelfeld, Ester Livshits, and Liat Peterfreund


Abstract
In its traditional definition, a repair of an inconsistent database is a consistent database that differs from the inconsistent one in a "minimal way." Often, repairs are not equally legitimate, as it is desired to prefer one over another; for example, one fact is regarded more reliable than another, or a more recent fact should be preferred to an earlier one. Motivated by these considerations, researchers have introduced and investigated the framework of preferred repairs, in the context of denial constraints and subset repairs. There, a priority relation between facts is lifted towards a priority relation between consistent databases, and repairs are restricted to the ones that are optimal in the lifted sense. Three notions of lifting (and optimal repairs) have been proposed: Pareto, global, and completion. In this paper we investigate the complexity of deciding whether the priority relation suffices to clean the database unambiguously, or in other words, whether there is exactly one optimal repair. We show that the different lifting semantics entail highly different complexities. Under Pareto optimality, the problem is coNP-complete, in data complexity, for every set of functional dependencies (FDs), except for the tractable case of (equivalence to) one FD per relation. Under global optimality, one FD per relation is still tractable, but we establish Pi_2^p-completeness for a relation with two FDs. In contrast, under completion optimality the problem is solvable in polynomial time for every set of FDs. In fact, we present a polynomial-time algorithm for arbitrary conflict hypergraphs. We further show that under a general assumption of transitivity, this algorithm solves the problem even for global optimality. The algorithm is extremely simple, but its proof of correctness is quite intricate.
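
On toy instances, the Pareto semantics can be checked by brute force. The sketch below (mine, with a deliberately simplified encoding) treats facts as conflicting when they share a key, takes repairs to be the maximal conflict-free subsets, and says R' dominates R if some fact in R' \ R is preferred to every fact in R \ R'; cleaning is unambiguous when exactly one repair is undominated:

from itertools import combinations

facts = {"f1": "k1", "f2": "k1", "f3": "k2"}   # f1 and f2 conflict (same key)
prefer = {("f1", "f2")}                        # f1 is regarded as more reliable than f2

def conflict_free(s):
    return all(facts[a] != facts[b] for a, b in combinations(s, 2))

repairs = [set(s) for r in range(len(facts) + 1)
           for s in combinations(facts, r) if conflict_free(s)
           and not any(conflict_free(set(s) | {f}) for f in facts if f not in s)]

def dominates(r2, r1):
    return any(all((g, f) in prefer for f in r1 - r2) for g in r2 - r1)

optimal = [r for r in repairs if not any(dominates(q, r) for q in repairs if q != r)]
print(optimal)                                 # exactly one optimal repair here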

Cite as

Benny Kimelfeld, Ester Livshits, and Liat Peterfreund. Detecting Ambiguity in Prioritized Database Repairing. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 17:1-17:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{kimelfeld_et_al:LIPIcs.ICDT.2017.17,
  author =	{Kimelfeld, Benny and Livshits, Ester and Peterfreund, Liat},
  title =	{{Detecting Ambiguity in Prioritized Database Repairing}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{17:1--17:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.17},
  URN =		{urn:nbn:de:0030-drops-70489},
  doi =		{10.4230/LIPIcs.ICDT.2017.17},
  annote =	{Keywords: inconsistent databases, preferred repairs, data cleaning, functional dependencies, conflict hypergraph}
}
Document
Compression of Unordered XML Trees

Authors: Markus Lohrey, Sebastian Maneth, and Carl Philipp Reh


Abstract
Many XML documents are data-centric and do not make use of the inherent document order. Can we provide stronger compression for such documents through giving up order? We first consider compression via minimal dags (directed acyclic graphs) and study the worst-case ratio of the size of the ordered dag divided by the size of the unordered dag, where the worst case is taken over all trees of size n. We prove that this worst-case ratio is n / log n for the edge size and n log log n / log n for the node size. In experiments we compare several known compressors on the original document tree versus on a canonical version obtained by length-lexicographical sorting of subtrees. For some documents this difference is surprisingly large: reverse binary dags can be smaller by a factor of 3.7 and other compressors can be smaller by factors of up to 190.
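
The effect of forgetting order is visible on tiny trees already. The sketch below (mine; it canonizes by sorting children by hash-consed keys rather than the length-lexicographical order used in the paper) compares the node size of the minimal dag before and after canonization:

def dag_nodes(tree, canonical=False):
    seen = {}
    def key(node):
        label, children = node
        ks = [key(c) for c in children]
        if canonical:
            ks.sort()                             # forget the order of the children
        return seen.setdefault((label, tuple(ks)), len(seen))
    key(tree)
    return len(seen)                              # identical subtrees share one dag node

t = ('f', [('g', [('a', []), ('b', [])]),
           ('g', [('b', []), ('a', [])])])
print(dag_nodes(t), dag_nodes(t, canonical=True))   # 5 versus 4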

Cite as

Markus Lohrey, Sebastian Maneth, and Carl Philipp Reh. Compression of Unordered XML Trees. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 18:1-18:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{lohrey_et_al:LIPIcs.ICDT.2017.18,
  author =	{Lohrey, Markus and Maneth, Sebastian and Reh, Carl Philipp},
  title =	{{Compression of Unordered XML Trees}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{18:1--18:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.18},
  URN =		{urn:nbn:de:0030-drops-70584},
  doi =		{10.4230/LIPIcs.ICDT.2017.18},
  annote =	{Keywords: tree compression, directed acyclic graphs, XML}
}
Document
Dynamic Complexity under Definable Changes

Authors: Thomas Schwentick, Nils Vortmeier, and Thomas Zeume


Abstract
This paper studies dynamic complexity under definable change operations in the DynFO framework by Patnaik and Immerman. It is shown that for changes definable by parameter-free first-order formulas, all (uniform) AC1 queries can be maintained by first-order dynamic programs. Furthermore, many maintenance results for single-tuple changes are extended to more powerful change operations: (1) The reachability query for undirected graphs is first-order maintainable under single tuple changes and first-order defined insertions, likewise the reachability query for directed acyclic graphs under quantifier-free insertions. (2) Context-free languages are first-order maintainable under changes defined by existential first-order formulas. These results are complemented by several inexpressibility results, for example, that the reachability query cannot be maintained by quantifier-free programs under definable, quantifier-free deletions.

Cite as

Thomas Schwentick, Nils Vortmeier, and Thomas Zeume. Dynamic Complexity under Definable Changes. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 19:1-19:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{schwentick_et_al:LIPIcs.ICDT.2017.19,
  author =	{Schwentick, Thomas and Vortmeier, Nils and Zeume, Thomas},
  title =	{{Dynamic Complexity under Definable Changes}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{19:1--19:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.19},
  URN =		{urn:nbn:de:0030-drops-70596},
  doi =		{10.4230/LIPIcs.ICDT.2017.19},
  annote =	{Keywords: dynamic descriptive complexity, SQL updates, dynamic programs}
}
Document
Constant Delay Enumeration for FO Queries over Databases with Local Bounded Expansion

Authors: Luc Segoufin and Alexandre Vigny


Abstract
We consider the evaluation of first-order queries over classes of databases with local bounded expansion. This class was introduced by Nešetřil and Ossona de Mendez and generalizes many well-known classes of databases, such as bounded degree, bounded treewidth or bounded expansion. It is known that over classes of databases with local bounded expansion, first-order sentences can be evaluated in pseudo-linear time (pseudo-linear time means that for all \epsilon there exists an algorithm working in time O(n^{1+\epsilon})). Here, we investigate other scenarios, where queries are not sentences. We show that first-order queries can be enumerated with constant delay after a pseudo-linear preprocessing over any class of databases having locally bounded expansion. We also show that, in this context, counting the number of solutions can be done in pseudo-linear time.

Cite as

Luc Segoufin and Alexandre Vigny. Constant Delay Enumeration for FO Queries over Databases with Local Bounded Expansion. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 20:1-20:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{segoufin_et_al:LIPIcs.ICDT.2017.20,
  author =	{Segoufin, Luc and Vigny, Alexandre},
  title =	{{Constant Delay Enumeration for FO Queries over Databases with Local Bounded Expansion}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{20:1--20:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.20},
  URN =		{urn:nbn:de:0030-drops-70602},
  doi =		{10.4230/LIPIcs.ICDT.2017.20},
  annote =	{Keywords: enumeration, first-order queries, local bounded expansion.}
}
Document
m-tables: Representing Missing Data

Authors: Bruhathi Sundarmurthy, Paraschos Koutris, Willis Lang, Jeffrey Naughton, and Val Tannen


Abstract
Representation systems have been widely used to capture different forms of incomplete data in various settings. However, existing representation systems are not expressive enough to handle the more complex scenarios of missing data that can occur in practice: these could vary from missing attribute values, missing a known number of tuples, or even missing an unknown number of tuples. In this work, we propose a new representation system called m-tables, that can represent many different types of missing data. We show that m-tables form a closed, complete and strong representation system under both set and bag semantics and are strictly more expressive than conditional tables under both the closed and open world assumptions. We further study the complexity of computing certain and possible answers in m-tables. Finally, we discuss how to "interpret" m-tables through a novel labeling scheme that marks a type of generalized tuples as certain or possible.

Cite as

Bruhathi Sundarmurthy, Paraschos Koutris, Willis Lang, Jeffrey Naughton, and Val Tannen. m-tables: Representing Missing Data. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 21:1-21:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{sundarmurthy_et_al:LIPIcs.ICDT.2017.21,
  author =	{Sundarmurthy, Bruhathi and Koutris, Paraschos and Lang, Willis and Naughton, Jeffrey and Tannen, Val},
  title =	{{m-tables: Representing Missing Data}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{21:1--21:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.21},
  URN =		{urn:nbn:de:0030-drops-70618},
  doi =		{10.4230/LIPIcs.ICDT.2017.21},
  annote =	{Keywords: missing values, incomplete data, c tables, representation systems}
}
Document
Better Streaming Algorithms for the Maximum Coverage Problem

Authors: Andrew McGregor and Hoa T. Vu


Abstract
We study the classic NP-Hard problem of finding the maximum k-set coverage in the data stream model: given a set system of m sets that are subsets of a universe {1,...,n}, find the k sets that cover the largest number of distinct elements. The problem can be approximated up to a factor 1-1/e in polynomial time. In the streaming-set model, the sets and their elements are revealed online. The main goal of our work is to design algorithms, with approximation guarantees as close as possible to 1-1/e, that use sublinear space o(mn). Our main results are: 1) Two (1-1/e-epsilon) approximation algorithms: one uses O(1/epsilon) passes and O(k/epsilon^2 polylog(m,n)) space whereas the other uses only a single pass but O(m/epsilon^2 polylog(m,n)) space. 2) We show that any approximation factor better than (1-(1-1/k)^k) in constant passes requires space that is linear in m for constant k, even if the algorithm is allowed unbounded processing time. We also demonstrate a single-pass, (1-epsilon) approximation algorithm using O(m/epsilon^2 min(k,1/epsilon) polylog(m,n)) space. We also study the maximum k-vertex coverage problem in the dynamic graph stream model. In this model, the stream consists of edge insertions and deletions of a graph on N vertices. The goal is to find k vertices that cover the largest number of distinct edges. We show that any constant approximation in constant passes requires space that is linear in N for constant k, whereas O(N/epsilon^2 polylog(m,n)) space is sufficient for a (1-epsilon) approximation and arbitrary k in a single pass. For regular graphs, we show that O(k/epsilon^3 polylog(m,n)) space is sufficient for a (1-epsilon) approximation in a single pass. We generalize this to a K-epsilon approximation when the ratio between the minimum and maximum degree is bounded below by K.
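
For reference, the offline baseline behind the (1-1/e) target is the classic greedy rule, which repeatedly picks the set with the largest marginal coverage (sketch mine; the paper's contribution is approaching this quality in few passes and sublinear space):

def greedy_max_coverage(sets, k):
    covered, chosen = set(), []
    for _ in range(k):
        i = max(range(len(sets)), key=lambda j: len(sets[j] - covered))
        chosen.append(i)
        covered |= sets[i]              # add the set with the largest marginal gain
    return chosen, covered

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
print(greedy_max_coverage(sets, k=2))   # ([2, 0], {1, 2, 3, 4, 5, 6, 7})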

Cite as

Andrew McGregor and Hoa T. Vu. Better Streaming Algorithms for the Maximum Coverage Problem. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 22:1-22:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{mcgregor_et_al:LIPIcs.ICDT.2017.22,
  author =	{McGregor, Andrew and Vu, Hoa T.},
  title =	{{Better Streaming Algorithms for the Maximum Coverage Problem}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{22:1--22:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.22},
  URN =		{urn:nbn:de:0030-drops-70620},
  doi =		{10.4230/LIPIcs.ICDT.2017.22},
  annote =	{Keywords: algorithms, data streams, approximation, maximum coverage}
}
