18th International Conference on Database Theory (ICDT 2015)

Document

Complete Volume

DOI: 10.4230/LIPIcs.ICDT.2015

LIPIcs, Volume 31, ICDT'15, Complete Volume

Authors: Marcelo Arenas and Martín Ugarte

Cite as

18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@Proceedings{arenas_et_al:LIPIcs.ICDT.2015,
  title =	{{LIPIcs, Volume 31, ICDT'15, Complete Volume}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015},
  URN =		{urn:nbn:de:0030-drops-50077},
  doi =		{10.4230/LIPIcs.ICDT.2015},
  annote =	{Keywords: Database Management, Normal forms, Schema and subschema, Query languages, Query processing, Relational databases, Distributed databases, Heterogeneous Databases, Online Information Services, Miscellaneous – Privacy, Office Automation: Workflow management, Performance Analysis and Design Aids: Formal}
}

Document

Front Matter

DOI: 10.4230/LIPIcs.ICDT.2015.i

Title, Table of Contents, Preface, ICDT 2015 Test of Time Award, Organization, External Reviewers, List of Authors

Authors: Marcelo Arenas and Martín Ugarte

Cite as

18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. i-xvi, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{arenas_et_al:LIPIcs.ICDT.2015.i,
  author =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  title =	{{Title, Table of Contents, Preface, ICDT 2015 Test of Time Award, Organization, External Reviewers, List of Authors}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{i--xvi},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.i},
  URN =		{urn:nbn:de:0030-drops-50002},
  doi =		{10.4230/LIPIcs.ICDT.2015.i},
  annote =	{Keywords: Title, Table of Contents, Preface, ICDT 2015 Test of Time Award, Organization, External Reviewers, List of Authors}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2015.1

The Confounding Problem of Private Data Release (Invited Talk)

Authors: Graham Cormode

Abstract

The demands to make data available are growing ever louder, including open data initiatives and "data monetization". But the problem of doing so without disclosing confidential information is a subtle and difficult one. Is "private data release" an oxymoron? This paper (accompanying an invited talk) aims to delve into the motivations of data release, explore the challenges, and outline some of the current statistical approaches developed in response to this confounding problem.

Cite as

Graham Cormode. The Confounding Problem of Private Data Release (Invited Talk). In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 1-12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{cormode:LIPIcs.ICDT.2015.1,
  author =	{Cormode, Graham},
  title =	{{The Confounding Problem of Private Data Release}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{1--12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.1},
  URN =		{urn:nbn:de:0030-drops-49977},
  doi =		{10.4230/LIPIcs.ICDT.2015.1},
  annote =	{Keywords: privacy, anonymization, data release}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2015.13

Using Locality for Efficient Query Evaluation in Various Computation Models (Invited Talk)

Authors: Nicole Schweikardt

Abstract

In the database theory and logic literature, different notions of locality of queries have been studied, the most prominent being Hanf locality and Gaifman locality. These notions are designed so that, in order to evaluate a local query in a given database, it suffices to look only at small neighbourhoods around tuples of elements that belong to the database. In this talk I want to give a survey of how to use locality for efficient query evaluation in various computation models. In particular, we will take a closer look at how to enumerate query results with constant delay, and at how to evaluate queries in a map-reduce like setting [Neven et al., ICDT 2015] or in Pregel [Malewicz et al., SIGMOD 2010]. Also, we will have a closer look at how to transform a given local query into a form suitable for exploiting its locality.

Cite as

Nicole Schweikardt. Using Locality for Efficient Query Evaluation in Various Computation Models (Invited Talk). In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 13-14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{schweikardt:LIPIcs.ICDT.2015.13,
  author =	{Schweikardt, Nicole},
  title =	{{Using Locality for Efficient Query Evaluation in Various Computation Models}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{13--14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.13},
  URN =		{urn:nbn:de:0030-drops-49987},
  doi =		{10.4230/LIPIcs.ICDT.2015.13},
  annote =	{Keywords: query evaluation, locality}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2015.15

Large-Scale Similarity Joins With Guarantees (Invited Talk)

Authors: Rasmus Pagh

Abstract

The ability to handle noisy or imprecise data is becoming increasingly important in computing. In the database community the notion of similarity join has been studied extensively, yet existing solutions have offered weak performance guarantees. Either they are based on deterministic filtering techniques that often, but not always, succeed in reducing computational costs, or they are based on randomized techniques that have improved guarantees on computational cost but come with a probability of not returning the correct result. The aim of this paper is to give an overview of randomized techniques for high-dimensional similarity search, and discuss recent advances towards making these techniques more widely applicable by eliminating probability of error and improving the locality of data access.

Cite as

Rasmus Pagh. Large-Scale Similarity Joins With Guarantees (Invited Talk). In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 15-24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{pagh:LIPIcs.ICDT.2015.15,
  author =	{Pagh, Rasmus},
  title =	{{Large-Scale Similarity Joins With Guarantees}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{15--24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.15},
  URN =		{urn:nbn:de:0030-drops-49995},
  doi =		{10.4230/LIPIcs.ICDT.2015.15},
  annote =	{Keywords: Similarity join, filtering, locality-sensitive hashing, recall}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.25

A Declarative Framework for Linking Entities

Authors: Douglas Burdick, Ronald Fagin, Phokion G. Kolaitis, Lucian Popa, and Wang-Chiew Tan

Abstract

The aim of this paper is to introduce and develop a truly declarative framework for entity linking and, in particular, for entity resolution. As in some earlier approaches, our framework is based on the systematic use of constraints. However, the constraints we adopt are link-to-source constraints, unlike in earlier approaches where source-to-link constraints were used to dictate how to generate links. Our approach makes it possible to focus entirely on the intended properties of the outcome of entity linking, thus separating the constraints from any procedure of how to achieve that outcome. The core language consists of link-to-source constraints that specify the desired properties of a link relation in terms of source relations and built-in predicates such as similarity measures. A key feature of the link-to-source constraints is that they employ disjunction, which enables the declarative listing of all the reasons as to why two entities should be linked. We also consider extensions of the core language that capture collective entity resolution, by allowing inter-dependence between links. We identify a class of "good" solutions for entity linking specifications, which we call maximum-value solutions and which capture the strength of a link by counting the reasons that justify it. We study natural algorithmic problems associated with these solutions, including the problem of enumerating the "good" solutions, and the problem of finding the certain links, which are the links that appear in every "good" solution. We show that these problems are tractable for the core language, but may become intractable once we allow inter-dependence between link relations. We also make some surprising connections between our declarative framework, which is deterministic, and probabilistic approaches such as ones based on Markov Logic Networks.

Cite as

Douglas Burdick, Ronald Fagin, Phokion G. Kolaitis, Lucian Popa, and Wang-Chiew Tan. A Declarative Framework for Linking Entities. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 25-43, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{burdick_et_al:LIPIcs.ICDT.2015.25,
  author =	{Burdick, Douglas and Fagin, Ronald and Kolaitis, Phokion G. and Popa, Lucian and Tan, Wang-Chiew},
  title =	{{A Declarative Framework for Linking Entities}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{25--43},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.25},
  URN =		{urn:nbn:de:0030-drops-49759},
  doi =		{10.4230/LIPIcs.ICDT.2015.25},
  annote =	{Keywords: entity linking, entity resolution, constraints, certain links}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.44

Asymptotic Determinacy of Path Queries using Union-of-Paths Views

Authors: Nadime Francis

Abstract

We consider the view determinacy problem over graph databases for queries defined as (possibly infinite) unions of path queries. These queries select pairs of nodes in a graph that are connected through a path whose length falls in a given set. A view specification is a set of such queries. We say that a view specification V determines a query Q if, for all databases D, the answers to V on D contain enough information to answer Q. Our main result states that, given a view V, there exists an explicit bound that depends on V such that we can decide the determinacy problem for all queries that ask for a path longer than this bound, and provide first-order rewritings for the queries that are determined. We call this notion asymptotic determinacy. As a corollary, we can also compute the set of almost all path queries that are determined by V.
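
As an illustration of the query class described above, the following Python sketch evaluates a union of path queries on a small directed graph. It is a minimal sketch under stated assumptions, not the paper's construction: the function names `walks_of_length` and `union_of_paths` are hypothetical, "path" is read as a directed walk (nodes may repeat), and only finite length sets are handled, whereas the abstract also allows infinite unions.

```python
def walks_of_length(edges, k):
    """Pairs (u, v) joined by a directed walk of exactly k edges."""
    nodes = {n for edge in edges for n in edge}
    pairs = {(n, n) for n in nodes}  # walks of length 0
    for _ in range(k):
        # Extend every walk found so far by one more edge.
        pairs = {(u, w) for (u, v) in pairs for (x, w) in edges if v == x}
    return pairs


def union_of_paths(edges, lengths):
    """Answer to a union-of-paths query given as a finite set of lengths."""
    result = set()
    for k in lengths:
        result |= walks_of_length(edges, k)
    return result


# Hypothetical example graph: a -> b -> c -> d
edges = {("a", "b"), ("b", "c"), ("c", "d")}
print(union_of_paths(edges, {2}))
print(union_of_paths(edges, {1, 3}))
```

In this reading, a view specification would be a collection of such length sets, and determinacy asks whether the view answers alone fix the answer to another query of the same kind.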

Cite as

Nadime Francis. Asymptotic Determinacy of Path Queries using Union-of-Paths Views. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 44-59, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{francis:LIPIcs.ICDT.2015.44,
  author =	{Francis, Nadime},
  title =	{{Asymptotic Determinacy of Path Queries using Union-of-Paths Views}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{44--59},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.44},
  URN =		{urn:nbn:de:0030-drops-49760},
  doi =		{10.4230/LIPIcs.ICDT.2015.44},
  annote =	{Keywords: Graph databases, Views, Determinacy, Rewriting, Path queries}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.60

Games for Active XML Revisited

Authors: Martin Schuster and Thomas Schwentick

Abstract

The paper studies the rewriting mechanisms for intensional documents in the Active XML framework, abstracted in the form of active context-free games. The safe rewriting problem studied in this paper is to decide whether the first player, Juliet, has a winning strategy for a given game and (nested) word; this corresponds to a successful rewriting strategy for a given intensional document. The paper examines several extensions to active context-free games. The primary extension allows more expressive schemas (namely XML schemas and regular nested word languages) for both target and replacement languages and has the effect that games are played on nested words instead of (flat) words as in previous studies. Other extensions consider validation of input parameters of web services, and an alternative semantics based on insertion of service call results. In general, the safe rewriting problem is highly intractable (doubly exponential time), but the paper identifies interesting tractable cases.

Cite as

Martin Schuster and Thomas Schwentick. Games for Active XML Revisited. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 60-75, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{schuster_et_al:LIPIcs.ICDT.2015.60,
  author =	{Schuster, Martin and Schwentick, Thomas},
  title =	{{Games for Active XML Revisited}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{60--75},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.60},
  URN =		{urn:nbn:de:0030-drops-49773},
  doi =		{10.4230/LIPIcs.ICDT.2015.60},
  annote =	{Keywords: Active XML, Computational Complexity, Nested Words, Rewriting Games, Semistructured Data}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.76

Answering Conjunctive Queries with Inequalities

Authors: Paraschos Koutris, Tova Milo, Sudeepa Roy, and Dan Suciu

Abstract

In this paper, we study the complexity of answering conjunctive queries (CQ) with inequalities. In particular, we compare the complexity of the query with and without inequalities. The main contribution of our work is a novel combinatorial technique that enables the use of any Select-Project-Join query plan for a given CQ without inequalities in answering the CQ with inequalities, with an additional factor in running time that only depends on the query. To achieve this, we define a new projection operator that keeps a small representation (independent of the size of the database) of the set of input tuples that map to each tuple in the output of the projection; this representation is used to evaluate all the inequalities in the query. Second, we generalize a result by Papadimitriou-Yannakakis [PODS'97] and give an alternative algorithm based on the color-coding technique [Alon, Yuster and Zwick, PODS'02] to evaluate a CQ with inequalities by using an algorithm for the CQ without inequalities. Third, we investigate the structure of the query graph, inequality graph, and the augmented query graph with inequalities, and show that even if the query and the inequality graphs have bounded treewidth, the augmented graph not only can have an unbounded treewidth but can also be NP-hard to evaluate. Further, we illustrate classes of queries and inequalities where the augmented graphs have unbounded treewidth, but the CQ with inequalities can be evaluated in poly-time. Finally, we give necessary properties and sufficient properties that allow a class of CQs to have poly-time combined complexity with respect to any inequality pattern.

Cite as

Paraschos Koutris, Tova Milo, Sudeepa Roy, and Dan Suciu. Answering Conjunctive Queries with Inequalities. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 76-93, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{koutris_et_al:LIPIcs.ICDT.2015.76,
  author =	{Koutris, Paraschos and Milo, Tova and Roy, Sudeepa and Suciu, Dan},
  title =	{{Answering Conjunctive Queries with Inequalities}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{76--93},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.76},
  URN =		{urn:nbn:de:0030-drops-49781},
  doi =		{10.4230/LIPIcs.ICDT.2015.76},
  annote =	{Keywords: query evaluation, conjunctive query, inequality, treewidth}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.94

SQL's Three-Valued Logic and Certain Answers

Authors: Leonid Libkin

Abstract

SQL uses three-valued logic for evaluating queries on databases with nulls. The standard theoretical approach to evaluating queries on incomplete databases is to compute certain answers. While these two cannot coincide, due to a significant complexity mismatch, we can still ask whether the two schemes are related in any way. For instance, does SQL always produce answers we can be certain about? This is not so: SQL's and certain answers semantics could be totally unrelated. We show, however, that a slight modification of the three-valued semantics for relational calculus queries can provide the required certainty guarantees. The key point of the new scheme is to fully utilize the three-valued semantics, and classify answers not into certain or non-certain, as was done before, but rather into certainly true, certainly false, or unknown. This yields relatively small changes to the evaluation procedure, which we consider at the level of both declarative (relational calculus) and procedural (relational algebra) queries. We also introduce a new notion of certain answers with nulls, which properly accounts for queries returning tuples containing null values.
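
The mismatch between SQL evaluation and certain answers mentioned above can be seen on a two-row example. The Python sketch below is only an illustration under stated assumptions: `None` stands in for SQL NULL, the helper names `eq3`, `sql_not_exists`, and `certain_difference` are hypothetical, nulls occur only in S, and certain answers are approximated by trying every replacement of the nulls from a small finite domain (the real definition ranges over all possible values).

```python
from itertools import product

NULL = None  # stand-in for SQL NULL in this sketch


def eq3(a, b):
    """Three-valued equality: True, False, or None (unknown)."""
    if a is NULL or b is NULL:
        return None
    return a == b


def sql_not_exists(r, s):
    """SELECT R.A FROM R WHERE NOT EXISTS (SELECT * FROM S WHERE S.A = R.A).

    The inner WHERE keeps a row only when the comparison is True,
    so an unknown comparison behaves like False there.
    """
    return {a for a in r if not any(eq3(a, b) is True for b in s)}


def certain_difference(r, s, domain):
    """Values that are in R - S under every replacement of the nulls in S."""
    null_positions = [i for i, b in enumerate(s) if b is NULL]
    certain = None
    for values in product(domain, repeat=len(null_positions)):
        completed = list(s)
        for pos, val in zip(null_positions, values):
            completed[pos] = val
        answer = set(r) - set(completed)
        certain = answer if certain is None else certain & answer
    return certain


r = [1]
s = [NULL]
print(sql_not_exists(r, s))              # SQL-style evaluation
print(certain_difference(r, s, [1, 2]))  # certain answers over a toy domain
```

On this input the SQL-style evaluation returns the value 1, while no value is certain: if the null in S stands for 1, the difference is empty.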

Cite as

Leonid Libkin. SQL's Three-Valued Logic and Certain Answers. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 94-109, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{libkin:LIPIcs.ICDT.2015.94,
  author =	{Libkin, Leonid},
  title =	{{SQL's Three-Valued Logic and Certain Answers}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{94--109},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.94},
  URN =		{urn:nbn:de:0030-drops-49791},
  doi =		{10.4230/LIPIcs.ICDT.2015.94},
  annote =	{Keywords: Null values, incomplete information, query evaluation, three-valued logic, certain answers}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.110

A Trichotomy in the Complexity of Counting Answers to Conjunctive Queries

Authors: Hubie Chen and Stefan Mengel

Abstract

Conjunctive queries are basic and heavily studied database queries; in relational algebra, they are the select-project-join queries. In this article, we study the fundamental problem of counting, given a conjunctive query and a relational database, the number of answers to the query on the database. In particular, we study the complexity of this problem relative to sets of conjunctive queries. We present a trichotomy theorem, which shows essentially that this problem on a set of conjunctive queries is either tractable, equivalent to the parameterized CLIQUE problem, or as hard as the parameterized counting CLIQUE problem; the criteria describing which of these situations occurs are simply stated, in terms of graph-theoretic conditions.
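
To make the counting problem concrete, here is a brute-force Python sketch that counts the answers to a conjunctive query on a small database. It is an assumption-laden illustration, not an algorithm from the paper: the function name `count_answers` and the encoding of atoms as `(relation, variables)` pairs are hypothetical, every free variable is assumed to occur in some atom, and the running time is exponential in the number of variables.

```python
from itertools import product


def count_answers(free_vars, atoms, database):
    """Count distinct free-variable assignments that extend to a full match.

    atoms:    list of (relation_name, tuple_of_variable_names)
    database: dict mapping relation_name to a set of tuples
    """
    variables = sorted({v for _, args in atoms for v in args})
    domain = sorted({c for rows in database.values() for row in rows for c in row})
    answers = set()
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(tuple(env[v] for v in args) in database[rel] for rel, args in atoms):
            # Project the satisfying assignment onto the free variables.
            answers.add(tuple(env[v] for v in free_vars))
    return len(answers)


db = {"E": {(1, 2), (2, 3), (1, 3)}}
atoms = [("E", ("x", "y"))]
print(count_answers(("x", "y"), atoms, db))  # no projection
print(count_answers(("x",), atoms, db))      # y is existentially quantified
```

The two calls differ only in which variables are free, which is where counting answers departs from counting satisfying assignments.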

Cite as

Hubie Chen and Stefan Mengel. A Trichotomy in the Complexity of Counting Answers to Conjunctive Queries. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 110-126, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{chen_et_al:LIPIcs.ICDT.2015.110,
  author =	{Chen, Hubie and Mengel, Stefan},
  title =	{{A Trichotomy in the Complexity of Counting Answers to Conjunctive Queries}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{110--126},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.110},
  URN =		{urn:nbn:de:0030-drops-49804},
  doi =		{10.4230/LIPIcs.ICDT.2015.110},
  annote =	{Keywords: database theory, query answering, conjunctive queries, counting complexity}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.127

Learning Tree Patterns from Example Graphs

Authors: Sara Cohen and Yaacov Y. Weiss

Abstract

This paper investigates the problem of learning tree patterns that return nodes with a given set of labels, from example graphs provided by the user. Example graphs are annotated by the user as being either positive or negative. The goal is then to determine whether there exists a tree pattern returning tuples of nodes with the given labels in each of the positive examples, but in none of the negative examples, and, furthermore, to find one such pattern if it exists. These are called the satisfiability and learning problems, respectively. This paper thoroughly investigates the satisfiability and learning problems in a variety of settings. In particular, we consider example sets that (1) may contain only positive examples, or both positive and negative examples, (2) may contain directed or undirected graphs, and (3) may have multiple occurrences of labels or be uniquely labeled (to some degree). In addition, we consider tree patterns of different types that can allow, or prohibit, wildcard labeled nodes and descendant edges. We also consider two different semantics for mapping tree patterns to graphs. The complexity of satisfiability is determined for the different combinations of settings. For cases in which satisfiability is polynomial, it is also shown that learning is polynomial (this is non-trivial, as satisfying patterns may be exponential in size). Finally, the minimal learning problem, i.e., that of finding a minimal-sized satisfying pattern, is studied for cases in which satisfiability is polynomial.

Cite as

Sara Cohen and Yaacov Y. Weiss. Learning Tree Patterns from Example Graphs. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 127-143, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{cohen_et_al:LIPIcs.ICDT.2015.127,
  author =	{Cohen, Sara and Weiss, Yaacov Y.},
  title =	{{Learning Tree Patterns from Example Graphs}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{127--143},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.127},
  URN =		{urn:nbn:de:0030-drops-49819},
  doi =		{10.4230/LIPIcs.ICDT.2015.127},
  annote =	{Keywords: tree patterns, learning, examples}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.144

Characterizing XML Twig Queries with Examples

Authors: Slawek Staworko and Piotr Wieczorek

Abstract

Typically, a (Boolean) query is a finite formula that defines a possibly infinite set of database instances that satisfy it (positive examples), and implicitly, the set of instances that do not satisfy the query (negative examples). We investigate the following natural question: for a given class of queries, is it possible to characterize every query with a finite set of positive and negative examples that no other query is consistent with? We study this question for twig queries and XML databases. We show that while twig queries are characterizable, they generally require exponential sets of examples. Consequently, we focus on a practical subclass of anchored twig queries and show that not only are they characterizable but also with polynomially-sized sets of examples. This result is obtained with the use of generalization operations on twig queries, whose application to an anchored twig query yields a properly contained and minimally different query. Our results illustrate further interesting and strong connections between the structure and the semantics of anchored twig queries that the class of arbitrary twig queries does not enjoy. Finally, we show that the class of unions of twig queries is not characterizable.

Cite as

Slawek Staworko and Piotr Wieczorek. Characterizing XML Twig Queries with Examples. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 144-160, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{staworko_et_al:LIPIcs.ICDT.2015.144,
  author =	{Staworko, Slawek and Wieczorek, Piotr},
  title =	{{Characterizing XML Twig Queries with Examples}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{144--160},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.144},
  URN =		{urn:nbn:de:0030-drops-49828},
  doi =		{10.4230/LIPIcs.ICDT.2015.144},
  annote =	{Keywords: Query characterization, Query examples, Query fitting, Twig queries}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.161

The Product Homomorphism Problem and Applications

Authors: Balder ten Cate and Victor Dalmau

Abstract

The product homomorphism problem (PHP) takes as input a finite collection of structures A_1, ..., A_n and a structure B, and asks if there is a homomorphism from the direct product of A_1, ..., A_n to B. We pinpoint the computational complexity of this problem. Our motivation stems from the fact that PHP naturally arises in different areas of database theory. In particular, it is equivalent to the problem of determining whether a relation is definable by a conjunctive query, and the existence of a schema mapping that fits a given collection of positive and negative data examples. We apply our results to obtain complexity bounds for these problems.
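
The problem statement above can be made concrete with a brute-force Python sketch restricted to directed graphs, that is, structures with a single binary relation. This is an illustration under stated assumptions rather than the paper's method: the names `direct_product`, `has_homomorphism`, and `cycle` are hypothetical, and the search enumerates all maps, so it is only usable on tiny inputs.

```python
from itertools import product


def direct_product(structures):
    """Direct product of digraphs, each given as a (nodes, edges) pair."""
    node_lists = [sorted(nodes) for nodes, _ in structures]
    edge_lists = [sorted(edges) for _, edges in structures]
    nodes = list(product(*node_lists))
    # A pair of tuples is an edge iff it is an edge in every coordinate.
    edges = {
        (tuple(e[0] for e in combo), tuple(e[1] for e in combo))
        for combo in product(*edge_lists)
    }
    return nodes, edges


def has_homomorphism(source, target):
    """Exhaustively search for an edge-preserving map from source to target."""
    src_nodes, src_edges = source
    tgt_nodes, tgt_edges = target
    src_nodes = list(src_nodes)
    for image in product(sorted(tgt_nodes), repeat=len(src_nodes)):
        h = dict(zip(src_nodes, image))
        if all((h[u], h[v]) in tgt_edges for u, v in src_edges):
            return True
    return False


def cycle(n):
    """Directed cycle on nodes 0..n-1 (hypothetical test structure)."""
    return set(range(n)), {(i, (i + 1) % n) for i in range(n)}


prod = direct_product([cycle(2), cycle(3)])
print(has_homomorphism(prod, cycle(3)))
print(has_homomorphism(prod, cycle(4)))
```

Here the product of the directed 2-cycle and 3-cycle is a directed 6-cycle, which maps to the 3-cycle by projection but cannot map to the 4-cycle, since 4 does not divide 6.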

Cite as

Balder ten Cate and Victor Dalmau. The Product Homomorphism Problem and Applications. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 161-176, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{tencate_et_al:LIPIcs.ICDT.2015.161,
  author =	{ten Cate, Balder and Dalmau, Victor},
  title =	{{The Product Homomorphism Problem and Applications}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{161--176},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.161},
  URN =		{urn:nbn:de:0030-drops-49832},
  doi =		{10.4230/LIPIcs.ICDT.2015.161},
  annote =	{Keywords: Homomorphisms, Direct Product, Data Examples, Definability, Conjunctive Queries, Schema Mappings}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.177

Regular Queries on Graph Databases

Authors: Juan L. Reutter, Miguel Romero, and Moshe Y. Vardi

Abstract

Graph databases are currently one of the most popular paradigms for storing data. One of the key conceptual differences between graph and relational databases is the focus on navigational queries that ask whether some nodes are connected by paths satisfying certain restrictions. This focus has driven the definition of several different query languages and the subsequent study of their fundamental properties. We define the graph query language of Regular Queries, which is a natural extension of unions of conjunctive 2-way regular path queries (UC2RPQs) and unions of conjunctive nested 2-way regular path queries (UCN2RPQs). Regular queries allow expressing complex regular patterns between nodes. We formalize regular queries as nonrecursive Datalog programs with transitive closure rules. This language has been previously considered, but its algorithmic properties are not well understood. Our main contribution is to show elementary tight bounds for the containment problem for regular queries. Specifically, we show that this problem is 2EXPSPACE-complete. For all extensions of regular queries known to date, the containment problem turns out to be non-elementary. Together with the fact that evaluating regular queries is not harder than evaluating UCN2RPQs, our results show that regular queries achieve a good balance between expressiveness and complexity, and constitute a well-behaved class that deserves further investigation.

Cite as

Juan L. Reutter, Miguel Romero, and Moshe Y. Vardi. Regular Queries on Graph Databases. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 177-194, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{reutter_et_al:LIPIcs.ICDT.2015.177,
  author =	{Reutter, Juan L. and Romero, Miguel and Vardi, Moshe Y.},
  title =	{{Regular Queries on Graph Databases}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{177--194},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.177},
  URN =		{urn:nbn:de:0030-drops-49842},
  doi =		{10.4230/LIPIcs.ICDT.2015.177},
  annote =	{Keywords: graph databases, conjunctive regular path queries, regular queries, containment.}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.195

Complexity and Expressiveness of ShEx for RDF

Authors: Slawek Staworko, Iovka Boneva, Jose E. Labra Gayo, Samuel Hym, Eric G. Prud'hommeaux, and Harold Solbrig

Abstract

We study the expressiveness and complexity of Shape Expression Schema (ShEx), a novel schema formalism for RDF currently under development by the W3C. A ShEx assigns types to the nodes of an RDF graph and allows constraining the admissible neighborhoods of nodes of a given type with regular bag expressions (RBEs). We formalize and investigate two alternative semantics, multi- and single-type, depending on whether or not a node may have more than one type. We study the expressive power of ShEx and the complexity of the validation problem. We show that the single-type semantics is strictly more expressive than the multi-type semantics, that single-type validation is generally intractable, and that multi-type validation is feasible for a small (yet practical) subclass of RBEs. To curb the high computational complexity of validation, we propose a natural notion of determinism and show that multi-type validation for the class of deterministic schemas using single-occurrence regular bag expressions (SORBEs) is tractable.

Cite as

Slawek Staworko, Iovka Boneva, Jose E. Labra Gayo, Samuel Hym, Eric G. Prud'hommeaux, and Harold Solbrig. Complexity and Expressiveness of ShEx for RDF. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 195-211, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{staworko_et_al:LIPIcs.ICDT.2015.195,
  author =	{Staworko, Slawek and Boneva, Iovka and Labra Gayo, Jose E. and Hym, Samuel and Prud'hommeaux, Eric G. and Solbrig, Harold},
  title =	{{Complexity and Expressiveness of ShEx for RDF}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{195--211},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.195},
  URN =		{urn:nbn:de:0030-drops-49856},
  doi =		{10.4230/LIPIcs.ICDT.2015.195},
  annote =	{Keywords: RDF, Schema, Graph topology, Validation, Complexity, Expressiveness}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.212

CONSTRUCT Queries in SPARQL

Authors: Egor V. Kostylev, Juan L. Reutter, and Martín Ugarte

Abstract

SPARQL has become the most popular language for querying RDF datasets, the standard data model for representing information on the Web. This query language has received a good deal of attention in the last few years: two versions of W3C standards have been issued, several SPARQL query engines have been deployed, and important theoretical foundations have been laid. However, many fundamental aspects of SPARQL queries are not yet fully understood. To this end, it is crucial to understand the correspondence between SPARQL and well-developed frameworks like relational algebra or first order logic. But one of the main obstacles on the way to such understanding is the fact that the well-studied fragments of SPARQL do not produce RDF as output. In this paper we embark on the study of SPARQL CONSTRUCT queries, that is, queries which output RDF graphs. This class of queries takes its rightful place in the standards and implementations, but contrary to SELECT queries, it has not yet attracted worthwhile theoretical research. Under this framework we are able to establish a strong connection between SPARQL and well-known logical and database formalisms. In particular, the fragment which does not allow for blank nodes in output templates corresponds to first order queries, its well-designed sub-fragment corresponds to positive first order queries, and the general language can be re-stated as a data exchange setting. These correspondences allow us to conclude that the general language is not composable, but the aforementioned blank-free fragments are. Finally, we enrich SPARQL with a recursion operator and establish fundamental properties of this extension.

Cite as

Egor V. Kostylev, Juan L. Reutter, and Martín Ugarte. CONSTRUCT Queries in SPARQL. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 212-229, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{kostylev_et_al:LIPIcs.ICDT.2015.212,
  author =	{Kostylev, Egor V. and Reutter, Juan L. and Ugarte, Mart{\'\i}n},
  title =	{{CONSTRUCT Queries in SPARQL}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{212--229},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.212},
  URN =		{urn:nbn:de:0030-drops-49866},
  doi =		{10.4230/LIPIcs.ICDT.2015.212},
  annote =	{Keywords: RDF, SPARQL, Query Languages}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.230

Separability by Short Subsequences and Subwords

Authors: Piotr Hofman and Wim Martens

Abstract

The separability problem for regular languages asks, given two regular languages I and E, whether there exists a language S that separates the two, that is, includes I but contains nothing from E. Typically, S comes from a simple, less expressive class of languages than I and E. In general, a simple separator S can be seen as an approximation of I or as an explanation of how I and E are different. In a database context, separators can be used for explaining the result of regular path queries or for finding explanations for the difference between paths in a graph database, that is, how paths from given nodes u_1 to v_1 are different from those from u_2 to v_2. We study the complexity of separability of regular languages by combinations of subsequences or subwords of a given length k. The rationale is that the parameter k can be used to influence the size and simplicity of the separator. The emphasis of our study is on tracing the tractability of the problem.

Cite as

Piotr Hofman and Wim Martens. Separability by Short Subsequences and Subwords. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 230-246, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{hofman_et_al:LIPIcs.ICDT.2015.230,
  author =	{Hofman, Piotr and Martens, Wim},
  title =	{{Separability by Short Subsequences and Subwords}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{230--246},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.230},
  URN =		{urn:nbn:de:0030-drops-49878},
  doi =		{10.4230/LIPIcs.ICDT.2015.230},
  annote =	{Keywords: separability, complexity, graph data, debugging}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.247

Process-Centric Views of Data-Driven Business Artifacts

Authors: Adrien Koutsos and Victor Vianu

Abstract

Declarative, data-aware workflow models are becoming increasingly pervasive. While these have numerous benefits, classical process-centric specifications retain certain advantages. Workflow designers are used to development tools such as BPMN or UML diagrams that focus on control flow. Views describing valid sequences of tasks are also useful to provide stakeholders with high-level descriptions of the workflow, stripped of the accompanying data. In this paper we study the problem of recovering process-centric views from declarative, data-aware workflow specifications in a variant of IBM's business artifact model. We focus on the simplest and most natural process-centric views, specified by finite-state transition systems, and describing regular languages. The results characterize when process-centric views of artifact systems are regular, using both linear and branching-time semantics. We also study the impact of data dependencies on regularity of the views.

Cite as

Adrien Koutsos and Victor Vianu. Process-Centric Views of Data-Driven Business Artifacts. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 247-264, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{koutsos_et_al:LIPIcs.ICDT.2015.247,
  author =	{Koutsos, Adrien and Vianu, Victor},
  title =	{{Process-Centric Views of Data-Driven Business Artifacts}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{247--264},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.247},
  URN =		{urn:nbn:de:0030-drops-49886},
  doi =		{10.4230/LIPIcs.ICDT.2015.247},
  annote =	{Keywords: Workflows, data-aware, process-centric, views}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.265

On The I/O Complexity of Dynamic Distinct Counting

Authors: Xiaocheng Hu, Yufei Tao, Yi Yang, Shengyu Zhang, and Shuigeng Zhou

Abstract

In dynamic distinct counting, we want to maintain a multi-set S of integers under insertions to answer efficiently the query: how many distinct elements are there in S? In external memory, the problem admits two standard solutions. The first one maintains S in a hash structure, so that the distinct count can be incrementally updated after each insertion using O(1) expected I/Os. A query is answered for free. The second one stores S in a linked list, and thus supports an insertion in O(1/B) amortized I/Os. A query can be answered in O(N/B log_{M/B} (N/B)) I/Os by sorting, where N=|S|, B is the block size, and M is the memory size. In this paper, we show that the above two naive solutions are already optimal within a polylog factor. Specifically, for any Las Vegas structure using N^{O(1)} blocks, if its expected amortized insertion cost is o(1/log B), then it must incur Omega(N/(B log B)) expected I/Os answering a query in the worst case, under the (realistic) condition that N is a polynomial of B. This means that the problem is repugnant to update buffering: the query cost jumps from 0 dramatically to almost linearity as soon as the insertion cost drops slightly below Omega(1).

Cite as

Xiaocheng Hu, Yufei Tao, Yi Yang, Shengyu Zhang, and Shuigeng Zhou. On The I/O Complexity of Dynamic Distinct Counting. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 265-276, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{hu_et_al:LIPIcs.ICDT.2015.265,
  author =	{Hu, Xiaocheng and Tao, Yufei and Yang, Yi and Zhang, Shengyu and Zhou, Shuigeng},
  title =	{{On The I/O Complexity of Dynamic Distinct Counting}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{265--276},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.265},
  URN =		{urn:nbn:de:0030-drops-49895},
  doi =		{10.4230/LIPIcs.ICDT.2015.265},
  annote =	{Keywords: distinct counting, lower bound, external memory}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.277

Shared-Constraint Range Reporting

Authors: Sudip Biswas, Manish Patil, Rahul Shah, and Sharma V. Thankachan

Abstract

Orthogonal range reporting is one of the classic and most fundamental data structure problems. A (2,1,1) query is a three-dimensional query with a two-sided constraint on the first dimension and a one-sided constraint on each of the second and third dimensions. Given a set of N points in three dimensions, a particular formulation of such a (2,1,1) query (known as four-sided range reporting in three dimensions) asks to report all those K points within a query region [a, b] x (-infinity, c] x [d, infinity). These queries have four constraints overall. In the Word-RAM model, the best known structure capable of answering such queries with optimal query time takes O(N log^{epsilon} N) space, where epsilon>0 is any positive constant. It has been shown that any external memory structure with optimal I/Os must use Omega(N log N/ log log_B N) space (in words), where B is the block size [Arge et al., PODS 1999]. In this paper, we study a special type of (2,1,1) queries, where the query parameters a and c are the same, i.e., a=c. Even though the query is still four-sided, the number of independent constraints is only three. In other words, one constraint is shared. We call this the Shared-Constraint Range Reporting (SCRR) problem. We study this problem in both internal and external memory models. In the RAM model where coordinates can only be compared, we achieve a linear-space and O(log N+K) query time solution, matching the best-known three-dimensional dominance query bound. In external memory, we present a linear-space structure with O(log_B N + log log N + K/B) query I/Os. We also present an I/O-optimal (i.e., O(log_B N+K/B) I/Os) data structure which occupies O(N log log N)-word space. We achieve these results by employing a novel divide and conquer approach. SCRR finds application in database queries containing sharing among the constraints. We also show that SCRR queries naturally arise in many well known problems such as top-k color reporting, range skyline reporting and ranked document retrieval.

Cite as

Sudip Biswas, Manish Patil, Rahul Shah, and Sharma V. Thankachan. Shared-Constraint Range Reporting. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 277-290, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{biswas_et_al:LIPIcs.ICDT.2015.277,
  author =	{Biswas, Sudip and Patil, Manish and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{Shared-Constraint Range Reporting}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{277--290},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.277},
  URN =		{urn:nbn:de:0030-drops-49900},
  doi =		{10.4230/LIPIcs.ICDT.2015.277},
  annote =	{Keywords: data structure, shared constraint, multi-slab, point partitioning}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.291

Optimal Broadcasting Strategies for Conjunctive Queries over Distributed Data

Authors: Bas Ketsman and Frank Neven

Abstract

In a distributed context where data is dispersed over many computing nodes, monotone queries can be evaluated in an eventually consistent and coordination-free manner through a simple but naive broadcasting strategy which makes all data available on every computing node. In this paper, we investigate more economical broadcasting strategies for full conjunctive queries without self-joins that only transmit a part of the local data necessary to evaluate the query at hand. We consider oblivious broadcasting strategies which determine which local facts to broadcast independent of the data at other computing nodes. We introduce the notion of broadcast dependency set (BDS) as a sound and complete formalism to represent locally optimal oblivious broadcasting functions. We provide algorithms to construct a BDS for a given conjunctive query and study the complexity of various decision problems related to these algorithms.

Cite as

Bas Ketsman and Frank Neven. Optimal Broadcasting Strategies for Conjunctive Queries over Distributed Data. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 291-307, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{ketsman_et_al:LIPIcs.ICDT.2015.291,
  author =	{Ketsman, Bas and Neven, Frank},
  title =	{{Optimal Broadcasting Strategies for Conjunctive Queries over Distributed Data}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{291--307},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.291},
  URN =		{urn:nbn:de:0030-drops-49913},
  doi =		{10.4230/LIPIcs.ICDT.2015.291},
  annote =	{Keywords: Coordination-free evaluation, conjunctive queries, broadcasting}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.308

Datalog Queries Distributing over Components

Authors: Tom J. Ameloot, Bas Ketsman, Frank Neven, and Daniel Zinn

Abstract

We investigate the class D of queries that distribute over components. These are the queries that can be evaluated by taking the union of the query results over the connected components of the database instance. We show that it is undecidable whether a (positive) Datalog program distributes over components. Additionally, we show that connected Datalog with Negation (the fragment of Datalog with Negation where all rules are connected) provides an effective syntax for Datalog with Negation programs that distribute over components under the stratified as well as under the well-founded semantics. As a corollary, we obtain a simple proof for one of the main results in previous work [Zinn, Green, and Ludäscher, ICDT 2012], namely, that the classic win-move query is in F_2 (a particular class of coordination-free queries).

Cite as

Tom J. Ameloot, Bas Ketsman, Frank Neven, and Daniel Zinn. Datalog Queries Distributing over Components. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 308-323, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{ameloot_et_al:LIPIcs.ICDT.2015.308,
  author =	{Ameloot, Tom J. and Ketsman, Bas and Neven, Frank and Zinn, Daniel},
  title =	{{Datalog Queries Distributing over Components}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{308--323},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.308},
  URN =		{urn:nbn:de:0030-drops-49920},
  doi =		{10.4230/LIPIcs.ICDT.2015.308},
  annote =	{Keywords: Datalog, stratified semantics, well-founded semantics, coordination-free evaluation, distributed databases}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.324

Distributed Streaming with Finite Memory

Authors: Frank Neven, Nicole Schweikardt, Frédéric Servais, and Tony Tan

Abstract

We introduce three formal models of distributed systems for query evaluation on massive databases: Distributed Streaming with Register Automata (DSAs), Distributed Streaming with Register Transducers (DSTs), and Distributed Streaming with Register Transducers and Joins (DSTJs). These models are based on the key-value paradigm where the input is transformed into a dataset of key-value pairs, and on each key a local computation is performed on the values associated with that key resulting in another set of key-value pairs. Computation proceeds in a constant number of rounds, where the result of the last round is the input to the next round, and transformation to key-value pairs is required to be generic. The difference between the three models is in the local computation part. In DSAs it is limited to making one pass over its input using a register automaton, while in DSTs it can make two passes: in the first pass it uses a finite-state automaton and in the second it uses a register transducer. The third model DSTJs is an extension of DSTs, where local computations are capable of constructing the Cartesian product of two sets. We obtain the following results: (1) DSAs can evaluate first-order queries over bounded degree databases; (2) DSTs can evaluate semijoin algebra queries over arbitrary databases; (3) DSTJs can evaluate the whole relational algebra over arbitrary databases; (4) DSTJs are strictly stronger than DSTs, which in turn, are strictly stronger than DSAs; (5) within DSAs, DSTs and DSTJs there is a strict hierarchy w.r.t. the number of rounds.

Cite as

Frank Neven, Nicole Schweikardt, Frédéric Servais, and Tony Tan. Distributed Streaming with Finite Memory. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 324-341, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{neven_et_al:LIPIcs.ICDT.2015.324,
  author =	{Neven, Frank and Schweikardt, Nicole and Servais, Fr\'{e}d\'{e}ric and Tan, Tony},
  title =	{{Distributed Streaming with Finite Memory}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{324--341},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.324},
  URN =		{urn:nbn:de:0030-drops-49939},
  doi =		{10.4230/LIPIcs.ICDT.2015.324},
  annote =	{Keywords: distributed systems, relational algebra, semijoin algebra, register automata, register transducers.}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.342

From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back

Authors: Babak Salimi and Leopoldo Bertossi

Abstract

In this work we establish and investigate connections between causality for query answers in databases, database repairs with respect to denial constraints, and consistency-based diagnosis. The first two are relatively new problems in databases, and the third one is an established subject in knowledge representation. We show how to obtain database repairs from causes and the other way around. Causality problems are formulated as diagnosis problems, and the diagnoses provide causes and their responsibilities. The vast body of research on database repairs can be applied to the newer problem of determining actual causes for query answers and their responsibilities. These connections, which are interesting per se, allow us, after a transition (inspired by consistency-based diagnosis) to computational problems on hitting sets and vertex covers in hypergraphs, to obtain several new algorithmic and complexity results for database causality.

Cite as

Babak Salimi and Leopoldo Bertossi. From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 342-362, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{salimi_et_al:LIPIcs.ICDT.2015.342,
  author =	{Salimi, Babak and Bertossi, Leopoldo},
  title =	{{From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{342--362},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.342},
  URN =		{urn:nbn:de:0030-drops-49948},
  doi =		{10.4230/LIPIcs.ICDT.2015.342},
  annote =	{Keywords: causality, diagnosis, repairs, consistent query answering, integrity constraints}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.363

On the Relationship between Consistent Query Answering and Constraint Satisfaction Problems

Authors: Carsten Lutz and Frank Wolter

Abstract

Recently, Fontaine has pointed out a connection between consistent query answering (CQA) and constraint satisfaction problems (CSP) [Fontaine, LICS 2013]. We investigate this connection more closely, identifying classes of CQA problems based on denial constraints and GAV constraints that correspond exactly to CSPs in the sense that a complexity classification of the CQA problems in each class is equivalent (up to FO-reductions) to classifying the complexity of all CSPs. We obtain these classes by admitting only monadic relations and only a single variable in denial constraints/GAVs and restricting queries to hypertree UCQs. We also observe that dropping the requirement of UCQs to be hypertrees corresponds to transitioning from CSP to its logical generalization MMSNP and identify a further relaxation that corresponds to transitioning from MMSNP to GMSNP (also known as MMSNP_2). Moreover, we use the CSP connection to carry over decidability of FO-rewritability and Datalog-rewritability to some of the identified classes of CQA problems.

Cite as

Carsten Lutz and Frank Wolter. On the Relationship between Consistent Query Answering and Constraint Satisfaction Problems. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 363-379, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{lutz_et_al:LIPIcs.ICDT.2015.363,
  author =	{Lutz, Carsten and Wolter, Frank},
  title =	{{On the Relationship between Consistent Query Answering and Constraint Satisfaction Problems}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{363--379},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.363},
  URN =		{urn:nbn:de:0030-drops-49958},
  doi =		{10.4230/LIPIcs.ICDT.2015.363},
  annote =	{Keywords: Consistent Query Answering, Constraint Satisfaction, Data Complexity, Dichotomies, Rewritability}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.380

On the Data Complexity of Consistent Query Answering over Graph Databases

Authors: Pablo Barceló and Gaëlle Fontaine

Abstract

Areas in which graph databases are applied - such as the semantic web, social networks and scientific databases - are prone to inconsistency, mainly due to interoperability issues. This raises the need for understanding query answering over inconsistent graph databases in a framework that is simple yet general enough to accommodate many of its applications. We follow the well-known approach of consistent query answering (CQA), and study the data complexity of CQA over graph databases for regular path queries (RPQs) and regular path constraints (RPCs), which are frequently used. We concentrate on subset, superset and symmetric difference repairs. Without further restrictions, CQA is undecidable for the semantics based on superset and symmetric difference repairs, and Pi_2^P-complete for subset repairs. However, we provide several tractable restrictions on both RPCs and the structure of graph databases that lead to decidability, and even tractability of CQA. We also compare our results with those obtained for CQA in the context of relational databases.

Cite as

Pablo Barceló and Gaëlle Fontaine. On the Data Complexity of Consistent Query Answering over Graph Databases. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 380-397, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2015.380,
  author =	{Barcel\'{o}, Pablo and Fontaine, Ga\"{e}lle},
  title =	{{On the Data Complexity of Consistent Query Answering over Graph Databases}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{380--397},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.380},
  URN =		{urn:nbn:de:0030-drops-49962},
  doi =		{10.4230/LIPIcs.ICDT.2015.380},
  annote =	{Keywords: graph databases, regular path queries, consistent query answering, description logics, rewrite systems}
}
