DROPS

Document

Different Differences in Semirings

Authors: Dan Suciu

Published in: OASIcs, Volume 119, The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen (2024)

Abstract

Relational algebra operates over relations under either set semantics or bag semantics. In 2007 Val Tannen extended the semantics of relational algebra to K-relations, where each tuple is annotated with a value from a semiring. However, only the positive fragment of the relational algebra can be interpreted over K-relations. The reason is that a semiring contains only the operations addition and multiplication, and does not have a difference operation. This paper explores three ways of adding a difference operator to a semiring: as a freely generated algebra, by using the natural order, or by an explicit construction using products and quotients. The paper consists of both a survey of results from the literature, and of some novel results.

Cite as

Dan Suciu. Different Differences in Semirings. In The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen. Open Access Series in Informatics (OASIcs), Volume 119, pp. 10:1-10:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{suciu:OASIcs.Tannen.10,
  author =	{Suciu, Dan},
  title =	{{Different Differences in Semirings}},
  booktitle =	{The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen},
  pages =	{10:1--10:20},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-320-1},
  ISSN =	{2190-6807},
  year =	{2024},
  volume =	{119},
  editor =	{Amarilli, Antoine and Deutsch, Alin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Tannen.10},
  URN =		{urn:nbn:de:0030-drops-201062},
  doi =		{10.4230/OASIcs.Tannen.10},
  annote =	{Keywords: Semirings, K-relations}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2023.8

Degree Sequence Bound for Join Cardinality Estimation

Authors: Kyle Deeds, Dan Suciu, Magda Balazinska, and Walter Cai

Published in: LIPIcs, Volume 255, 26th International Conference on Database Theory (ICDT 2023)

Abstract

Recent work has demonstrated the catastrophic effects of poor cardinality estimates on query processing time. In particular, underestimating query cardinality can result in overly optimistic query plans which take orders of magnitude longer to complete than one generated with the true cardinality. Cardinality bounding avoids this pitfall by computing an upper bound on the query’s output size using statistics about the database such as table sizes and degrees, i.e. value frequencies. In this paper, we extend this line of work by proving a novel bound called the Degree Sequence Bound which takes into account the full degree sequences and the max tuple multiplicity. This work focuses on the important class of Berge-Acyclic queries for which the Degree Sequence Bound is tight. Further, we describe how to practically compute this bound using a functional approximation of the true degree sequences and prove that even this functional form improves upon previous bounds.

Cite as

Kyle Deeds, Dan Suciu, Magda Balazinska, and Walter Cai. Degree Sequence Bound for Join Cardinality Estimation. In 26th International Conference on Database Theory (ICDT 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 255, pp. 8:1-8:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{deeds_et_al:LIPIcs.ICDT.2023.8,
  author =	{Deeds, Kyle and Suciu, Dan and Balazinska, Magda and Cai, Walter},
  title =	{{Degree Sequence Bound for Join Cardinality Estimation}},
  booktitle =	{26th International Conference on Database Theory (ICDT 2023)},
  pages =	{8:1--8:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-270-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{255},
  editor =	{Geerts, Floris and Vandevoort, Brecht},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2023.8},
  URN =		{urn:nbn:de:0030-drops-177508},
  doi =		{10.4230/LIPIcs.ICDT.2023.8},
  annote =	{Keywords: Cardinality Estimation, Cardinality Bounding, Degree Bounds, Functional Approximation, Query Planning, Berge-Acyclic Queries}
}

Document

DOI: 10.4230/DagRep.12.7.180

Algorithmic Aspects of Information Theory (Dagstuhl Seminar 22301)

Authors: Phokion G. Kolaitis, Andrej E. Romashchenko, Milan Studený, Dan Suciu, and Tobias A. Boege

Published in: Dagstuhl Reports, Volume 12, Issue 7 (2023)

Abstract

This report documents the program and the outcomes of Dagstuhl Seminar 22301 "Algorithmic Aspects of Information Theory". Constraints on entropies constitute the "laws of information theory". These constraints go well beyond Shannon’s basic information inequalities, as they include not only information inequalities that cannot be derived from Shannon’s basic inequalities, but also conditional inequalities and disjunctive inequalities that are valid for all entropic functions. There is an extensive body of research on constraints on entropies and their applications to different areas of mathematics and computer science. So far, however, little progress has been made on the algorithmic aspects of information theory. In fact, even fundamental questions about the decidability of information inequalities and their variants have remained open to date. Recently, research in different applications has demonstrated a clear need for algorithmic solutions to questions in information theory. These applications include: finding tight upper bounds on the answer to a query on a relational database, the homomorphism domination problem and its uses in query optimization, the conditional independence implication problem, soft constraints in databases, group-theoretic inequalities, and lower bounds on the information ratio in secret sharing. Thus far, the information-theory community has had little interaction with the communities where these applications have been studied or with the computational complexity community. The main goal of this Dagstuhl Seminar was to bring together researchers from the aforementioned communities and to develop an agenda for studying algorithmic aspects of information theory, motivated from a rich set of diverse applications. By using the algorithmic lens to examine the common problems and by transferring techniques from one community to the other, we expected that bridges would be created and some tangible progress on open questions could be made.

Cite as

Phokion G. Kolaitis, Andrej E. Romashchenko, Milan Studený, Dan Suciu, and Tobias A. Boege. Algorithmic Aspects of Information Theory (Dagstuhl Seminar 22301). In Dagstuhl Reports, Volume 12, Issue 7, pp. 180-204, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@Article{kolaitis_et_al:DagRep.12.7.180,
  author =	{Kolaitis, Phokion G. and Romashchenko, Andrej E. and Studen\'{y}, Milan and Suciu, Dan and Boege, Tobias A.},
  title =	{{Algorithmic Aspects of Information Theory (Dagstuhl Seminar 22301)}},
  pages =	{180--204},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2023},
  volume =	{12},
  number =	{7},
  editor =	{Kolaitis, Phokion G. and Romashchenko, Andrej E. and Studen\'{y}, Milan and Suciu, Dan and Boege, Tobias A.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.12.7.180},
  URN =		{urn:nbn:de:0030-drops-176155},
  doi =		{10.4230/DagRep.12.7.180},
  annote =	{Keywords: Information theory, Information inequalities, Conditional independence structures, Database query evaluation and containment, Decision problems}
}

Document

Track B: Automata, Logic, Semantics, and Theory of Programming

DOI: 10.4230/LIPIcs.ICALP.2020.106

Decision Problems in Information Theory

Authors: Mahmoud Abo Khamis, Phokion G. Kolaitis, Hung Q. Ngo, and Dan Suciu

Published in: LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)

Abstract

Constraints on entropies are considered to be the laws of information theory. Even though the pursuit of their discovery has been a central theme of research in information theory, the algorithmic aspects of constraints on entropies remain largely unexplored. Here, we initiate an investigation of decision problems about constraints on entropies by placing several different such problems into levels of the arithmetical hierarchy. We establish the following results on checking the validity over all almost-entropic functions: first, validity of a Boolean information constraint arising from a monotone Boolean formula is co-recursively enumerable; second, validity of "tight" conditional information constraints is in Π⁰₃. Furthermore, under some restrictions, validity of conditional information constraints "with slack" is in Σ⁰₂, and validity of information inequality constraints involving max is Turing equivalent to validity of information inequality constraints (with no max involved). We also prove that the classical implication problem for conditional independence statements is co-recursively enumerable.

Cite as

Mahmoud Abo Khamis, Phokion G. Kolaitis, Hung Q. Ngo, and Dan Suciu. Decision Problems in Information Theory. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 106:1-106:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{abokhamis_et_al:LIPIcs.ICALP.2020.106,
  author =	{Abo Khamis, Mahmoud and Kolaitis, Phokion G. and Ngo, Hung Q. and Suciu, Dan},
  title =	{{Decision Problems in Information Theory}},
  booktitle =	{47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)},
  pages =	{106:1--106:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-138-2},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{168},
  editor =	{Czumaj, Artur and Dawar, Anuj and Merelli, Emanuela},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2020.106},
  URN =		{urn:nbn:de:0030-drops-125137},
  doi =		{10.4230/LIPIcs.ICALP.2020.106},
  annote =	{Keywords: Information theory, decision problems, arithmetical hierarchy, entropic functions}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2020.18

Integrity Constraints Revisited: From Exact to Approximate Implication

Authors: Batya Kenig and Dan Suciu

Published in: LIPIcs, Volume 155, 23rd International Conference on Database Theory (ICDT 2020)

Abstract

Integrity constraints such as functional dependencies (FD), and multi-valued dependencies (MVD) are fundamental in database schema design. Likewise, probabilistic conditional independences (CI) are crucial for reasoning about multivariate probability distributions. The implication problem studies whether a set of constraints (antecedents) implies another constraint (consequent), and has been investigated in both the database and the AI literature, under the assumption that all constraints hold exactly. However, many applications today consider constraints that hold only approximately. In this paper we define an approximate implication as a linear inequality between the degree of satisfaction of the antecedents and consequent, and we study the relaxation problem: when does an exact implication relax to an approximate implication? We use information theory to define the degree of satisfaction, and prove several results. First, we show that any implication from a set of data dependencies (MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most quadratic in the number of variables; when the consequent is an FD, the factor can be reduced to 1. Second, we prove that there exists an implication between CIs that does not admit any relaxation; however, we prove that every implication between CIs relaxes "in the limit". Finally, we show that the implication problem for differential constraints in market basket analysis also admits a relaxation with a factor equal to 1. Our results recover, and sometimes extend, several previously known results about the implication problem: implication of MVDs can be checked by considering only 2-tuple relations, and the implication of differential constraints for frequent item sets can be checked by considering only databases containing a single transaction.

Cite as

Batya Kenig and Dan Suciu. Integrity Constraints Revisited: From Exact to Approximate Implication. In 23rd International Conference on Database Theory (ICDT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 155, pp. 18:1-18:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{kenig_et_al:LIPIcs.ICDT.2020.18,
  author =	{Kenig, Batya and Suciu, Dan},
  title =	{{Integrity Constraints Revisited: From Exact to Approximate Implication}},
  booktitle =	{23rd International Conference on Database Theory (ICDT 2020)},
  pages =	{18:1--18:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-139-9},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{155},
  editor =	{Lutz, Carsten and Jung, Jean Christoph},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2020.18},
  URN =		{urn:nbn:de:0030-drops-119420},
  doi =		{10.4230/LIPIcs.ICDT.2020.18},
  annote =	{Keywords: Integrity constraints, The implication problem}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2019.21

Boolean Tensor Decomposition for Conjunctive Queries with Negation

Authors: Mahmoud Abo Khamis, Hung Q. Ngo, Dan Olteanu, and Dan Suciu

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)

Abstract

We propose an approach for answering conjunctive queries with negation, where the negated relations have bounded degree. Its data complexity matches that of the InsideOut and PANDA algorithms for the positive subquery of the input query and is expressed in terms of the fractional hypertree width and the submodular width respectively. Its query complexity depends on the structure of the conjunction of negated relations; in general it is exponential in the number of join variables occurring in negated relations yet it becomes polynomial for several classes of queries. This approach relies on several contributions. We show how to rewrite queries with negation on bounded-degree relations into equivalent conjunctive queries with not-all-equal (NAE) predicates, which are a multi-dimensional analog of disequality (!=). We then generalize the known color-coding technique to conjunctions of NAE predicates and explain it via a Boolean tensor decomposition of conjunctions of NAE predicates. This decomposition can be achieved via a probabilistic construction that can be derandomized efficiently.

Cite as

Mahmoud Abo Khamis, Hung Q. Ngo, Dan Olteanu, and Dan Suciu. Boolean Tensor Decomposition for Conjunctive Queries with Negation. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 21:1-21:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{abokhamis_et_al:LIPIcs.ICDT.2019.21,
  author =	{Abo Khamis, Mahmoud and Ngo, Hung Q. and Olteanu, Dan and Suciu, Dan},
  title =	{{Boolean Tensor Decomposition for Conjunctive Queries with Negation}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  pages =	{21:1--21:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.21},
  URN =		{urn:nbn:de:0030-drops-103236},
  doi =		{10.4230/LIPIcs.ICDT.2019.21},
  annote =	{Keywords: color-coding, combined complexity, negation, query evaluation}
}

Document

DOI: 10.4230/DagMan.7.1.1

Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151)

Authors: Serge Abiteboul, Marcelo Arenas, Pablo Barceló, Meghyn Bienvenu, Diego Calvanese, Claire David, Richard Hull, Eyke Hüllermeier, Benny Kimelfeld, Leonid Libkin, Wim Martens, Tova Milo, Filip Murlak, Frank Neven, Magdalena Ortiz, Thomas Schwentick, Julia Stoyanovich, Jianwen Su, Dan Suciu, Victor Vianu, and Ke Yi

Published in: Dagstuhl Manifestos, Volume 7, Issue 1 (2018)

Abstract

The area of Principles of Data Management (PDM) has made crucial contributions to the development of formal frameworks for understanding and managing data and knowledge. This work has involved a rich cross-fertilization between PDM and other disciplines in mathematics and computer science, including logic, complexity theory, and knowledge representation. We anticipate on-going expansion of PDM research as the technology and applications involving data management continue to grow and evolve. In particular, the lifecycle of Big Data Analytics raises a wealth of challenge areas that PDM can help with. In this report we identify some of the most important research directions where the PDM community has the potential to make significant contributions. This is done from three perspectives: potential practical relevance, results already obtained, and research questions that appear surmountable in the short and medium term.

Cite as

Serge Abiteboul, Marcelo Arenas, Pablo Barceló, Meghyn Bienvenu, Diego Calvanese, Claire David, Richard Hull, Eyke Hüllermeier, Benny Kimelfeld, Leonid Libkin, Wim Martens, Tova Milo, Filip Murlak, Frank Neven, Magdalena Ortiz, Thomas Schwentick, Julia Stoyanovich, Jianwen Su, Dan Suciu, Victor Vianu, and Ke Yi. Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151). In Dagstuhl Manifestos, Volume 7, Issue 1, pp. 1-29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@Article{abiteboul_et_al:DagMan.7.1.1,
  author =	{Abiteboul, Serge and Arenas, Marcelo and Barcel\'{o}, Pablo and Bienvenu, Meghyn and Calvanese, Diego and David, Claire and Hull, Richard and H\"{u}llermeier, Eyke and Kimelfeld, Benny and Libkin, Leonid and Martens, Wim and Milo, Tova and Murlak, Filip and Neven, Frank and Ortiz, Magdalena and Schwentick, Thomas and Stoyanovich, Julia and Su, Jianwen and Suciu, Dan and Vianu, Victor and Yi, Ke},
  title =	{{Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151)}},
  pages =	{1--29},
  journal =	{Dagstuhl Manifestos},
  ISSN =	{2193-2433},
  year =	{2018},
  volume =	{7},
  number =	{1},
  editor =	{Abiteboul, Serge and Arenas, Marcelo and Barcel\'{o}, Pablo and Bienvenu, Meghyn and Calvanese, Diego and David, Claire and Hull, Richard and H\"{u}llermeier, Eyke and Kimelfeld, Benny and Libkin, Leonid and Martens, Wim and Milo, Tova and Murlak, Filip and Neven, Frank and Ortiz, Magdalena and Schwentick, Thomas and Stoyanovich, Julia and Su, Jianwen and Suciu, Dan and Vianu, Victor and Yi, Ke},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagMan.7.1.1},
  URN =		{urn:nbn:de:0030-drops-86772},
  doi =		{10.4230/DagMan.7.1.1},
  annote =	{Keywords: database theory, principles of data management, query languages, efficient query processing, query optimization, heterogeneous data, uncertainty, knowledge-enriched data management, machine learning, workflows, human-related data, ethics}
}

@Article{abiteboul_et_al:DagMan.7.1.1,
  author =	{Abiteboul, Serge and Arenas, Marcelo and Barcel\'{o}, Pablo and Bienvenu, Meghyn and Calvanese, Diego and David, Claire and Hull, Richard and H\"{u}llermeier, Eyke and Kimelfeld, Benny and Libkin, Leonid and Martens, Wim and Milo, Tova and Murlak, Filip and Neven, Frank and Ortiz, Magdalena and Schwentick, Thomas and Stoyanovich, Julia and Su, Jianwen and Suciu, Dan and Vianu, Victor and Yi, Ke},
  title =	{{Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151)}},
  pages =	{1--29},
  journal =	{Dagstuhl Manifestos},
  ISSN =	{2193-2433},
  year =	{2018},
  volume =	{7},
  number =	{1},
  editor =	{Abiteboul, Serge and Arenas, Marcelo and Barcel\'{o}, Pablo and Bienvenu, Meghyn and Calvanese, Diego and David, Claire and Hull, Richard and H\"{u}llermeier, Eyke and Kimelfeld, Benny and Libkin, Leonid and Martens, Wim and Milo, Tova and Murlak, Filip and Neven, Frank and Ortiz, Magdalena and Schwentick, Thomas and Stoyanovich, Julia and Su, Jianwen and Suciu, Dan and Vianu, Victor and Yi, Ke},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagMan.7.1.1},
  URN =		{urn:nbn:de:0030-drops-86772},
  doi =		{10.4230/DagMan.7.1.1},
  annote =	{Keywords: database theory, principles of data management, query languages, efficient query processing, query optimization, heterogeneous data, uncertainty, knowledge-enriched data management, machine learning, workflows, human-related data, ethics}
}

Document

DOI: 10.4230/DagRep.7.9.62

Recent Trends in Knowledge Compilation (Dagstuhl Seminar 17381)

Authors: Adnan Darwiche, Pierre Marquis, Dan Suciu, and Stefan Szeider

Published in: Dagstuhl Reports, Volume 7, Issue 9 (2018)

Abstract

Knowledge compilation (KC) is a research topic which aims to investigate the possibility of circumventing the computational intractability of hard tasks, by preprocessing part of the available information, common to a number of instances. Pioneered almost three decades ago, KC is nowadays a very active research field, transversal to several areas within computer science. Among others, KC intersects knowledge representation, constraint satisfaction, algorithms, complexity theory, machine learning, and databases. The results obtained so far take various forms, from theory (compilability settings, definition of target languages for KC, complexity results, succinctness results, etc.) to more practical results (development and evaluation of compilers and other preprocessors, applications to diagnosis, planning, automatic configuration, etc.). Recently, KC has been positioned as providing a systematic method for solving problems beyond NP, and also found applications in machine learning. The goal of this Dagstuhl Seminar was to advance both aspects of KC, and to pave the way for a fruitful cross-fertilization between the topics, from theory to practice. The program included a mixture of long and short presentations, with discussions. Several long talks with a tutorial flavor introduced the participants to the variety of aspects in knowledge compilation and the diversity of techniques used. System presentations as well as an open problem session were also included in the program.

Cite as

Adnan Darwiche, Pierre Marquis, Dan Suciu, and Stefan Szeider. Recent Trends in Knowledge Compilation (Dagstuhl Seminar 17381). In Dagstuhl Reports, Volume 7, Issue 9, pp. 62-85, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@Article{darwiche_et_al:DagRep.7.9.62,
  author =	{Darwiche, Adnan and Marquis, Pierre and Suciu, Dan and Szeider, Stefan},
  title =	{{Recent Trends in Knowledge Compilation (Dagstuhl Seminar 17381)}},
  pages =	{62--85},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2018},
  volume =	{7},
  number =	{9},
  editor =	{Darwiche, Adnan and Marquis, Pierre and Suciu, Dan and Szeider, Stefan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.7.9.62},
  URN =		{urn:nbn:de:0030-drops-85896},
  doi =		{10.4230/DagRep.7.9.62},
  annote =	{Keywords: Knowledge compilation, Constraints, Preprocessing, Probabilistic databases, Model counting}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2016.8

Worst-Case Optimal Algorithms for Parallel Query Processing

Authors: Paraschos Koutris, Paul Beame, and Dan Suciu

Published in: LIPIcs, Volume 48, 19th International Conference on Database Theory (ICDT 2016)

Abstract

In this paper, we study the communication complexity for the problem of computing a conjunctive query on a large database in a parallel setting with p servers. In contrast to previous work, where upper and lower bounds on the communication were specified for particular structures of data (either data without skew, or data with specific types of skew), in this work we focus on worst-case analysis of the communication cost. The goal is to find worst-case optimal parallel algorithms, similar to the work of (Ngo et al. 2012) for sequential algorithms. We first show that for a single round we can obtain an optimal worst-case algorithm. The optimal load for a conjunctive query q when all relations have size equal to M is O(M/p^{1/psi^*}), where psi^* is a new query-related quantity called the edge quasi-packing number, which is different from both the edge packing number and edge cover number of the query hypergraph. For multiple rounds, we present algorithms that are optimal for several classes of queries. Finally, we show a surprising connection to the external memory model, which allows us to translate parallel algorithms to external memory algorithms. This technique allows us to recover (within a polylogarithmic factor) several recent results on the I/O complexity for computing join queries, and also obtain optimal algorithms for other classes of queries.

Cite as

Paraschos Koutris, Paul Beame, and Dan Suciu. Worst-Case Optimal Algorithms for Parallel Query Processing. In 19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, pp. 8:1-8:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{koutris_et_al:LIPIcs.ICDT.2016.8,
  author =	{Koutris, Paraschos and Beame, Paul and Suciu, Dan},
  title =	{{Worst-Case Optimal Algorithms for Parallel Query Processing}},
  booktitle =	{19th International Conference on Database Theory (ICDT 2016)},
  pages =	{8:1--8:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-002-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{48},
  editor =	{Martens, Wim and Zeume, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2016.8},
  URN =		{urn:nbn:de:0030-drops-57771},
  doi =		{10.4230/LIPIcs.ICDT.2016.8},
  annote =	{Keywords: conjunctive query, parallel computation, worst-case bounds}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.76

Answering Conjunctive Queries with Inequalities

Authors: Paraschos Koutris, Tova Milo, Sudeepa Roy, and Dan Suciu

Published in: LIPIcs, Volume 31, 18th International Conference on Database Theory (ICDT 2015)

Abstract

In this parer, we study the complexity of answering conjunctive queries (CQ) with inequalities. In particular, we compare the complexity of the query with and without inequalities. The main contribution of our work is a novel combinatorial technique that enables the use of any Select-Project-Join query plan for a given CQ without inequalities in answering the CQ with inequalities, with an additional factor in running time that only depends on the query. To achieve this, we define a new projection operator that keeps a small representation (independent of the size of the database) of the set of input tuples that map to each tuple in the output of the projection; this representation is used to evaluate all the inequalities in the query. Second, we generalize a result by Papadimitriou-Yannakakis [PODS'97] and give an alternative algorithm based on the color-coding technique [Alon, Yuster and Zwick, PODS'02] to evaluate a CQ with inequalities by using an algorithm for the CQ without inequalities. Third, we investigate the structure of the query graph, inequality graph, and the augmented query graph with inequalities, and show that even if the query and the inequality graphs have bounded treewidth, the augmented graph not only can have an unbounded treewidth but can also be NP-hard to evaluate. Further, we illustrate classes of queries and inequalities where the augmented graphs have unbounded treewidth, but the CQ with inequalities can be evaluated in poly-time. Finally, we give necessary properties and sufficient properties that allow a class of CQs to have poly-time combined complexity with respect to any inequality pattern.

Cite as

Paraschos Koutris, Tova Milo, Sudeepa Roy, and Dan Suciu. Answering Conjunctive Queries with Inequalities. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 76-93, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Copy BibTex To Clipboard

@InProceedings{koutris_et_al:LIPIcs.ICDT.2015.76,
  author =	{Koutris, Paraschos and Milo, Tova and Roy, Sudeepa and Suciu, Dan},
  title =	{{Answering Conjunctive Queries with Inequalities}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{76--93},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.76},
  URN =		{urn:nbn:de:0030-drops-49781},
  doi =		{10.4230/LIPIcs.ICDT.2015.76},
  annote =	{Keywords: query evaluation, conjunctive query, inequality, treewidth}
}

Document

DOI: 10.4230/DagSemProc.05061.1

05061 Abstracts Collection – Foundations of Semistructured Data

Authors: Frank Neven, Thomas Schwentick, and Dan Suciu

Published in: Dagstuhl Seminar Proceedings, Volume 5061, Foundations of Semistructured Data (2005)

Abstract

From 06.02.05 to 11.02.05, the Dagstuhl Seminar 05061 ``Foundations of Semistructured Data'' was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Cite as

Frank Neven, Thomas Schwentick, and Dan Suciu. 05061 Abstracts Collection – Foundations of Semistructured Data. In Foundations of Semistructured Data. Dagstuhl Seminar Proceedings, Volume 5061, pp. 1-13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2005)

Copy BibTex To Clipboard

@InProceedings{neven_et_al:DagSemProc.05061.1,
  author =	{Neven, Frank and Schwentick, Thomas and Suciu, Dan},
  title =	{{05061 Abstracts Collection – Foundations of Semistructured Data}},
  booktitle =	{Foundations of Semistructured Data},
  pages =	{1--13},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2005},
  volume =	{5061},
  editor =	{Frank Neven and Thomas Schwentick and Dan Suciu},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.05061.1},
  URN =		{urn:nbn:de:0030-drops-2330},
  doi =		{10.4230/DagSemProc.05061.1},
  annote =	{Keywords: Semistructured data, XML, database theory, document processing}
}

Document

DOI: 10.4230/DagSemProc.05061.2

05061 Summary – Foundations of Semi-structured Data

Authors: Frank Neven, Thomas Schwentick, and Dan Suciu

Published in: Dagstuhl Seminar Proceedings, Volume 5061, Foundations of Semistructured Data (2005)

Abstract

As in the first seminar on this topic, the aim o the workshop was to bring together people from the areas related to semi-structured data. However, besides the presentation of recent work, this time the main goal was to identify the main lines of a common framework for future foundational work on semi-structured data. These lines of research are summarized below. The workshop was of a very interdisciplinary nature with invitees from databases, structured documents, programming languages, information retrieval and formal language theory. Several of the lectures were presented by PhD students. We had four invited speakers and a panel on research evaluation. Due to strong connections between topics treated at this workshop, many of the participants initiated new cooperations and research projects.

Cite as

Frank Neven, Thomas Schwentick, and Dan Suciu. 05061 Summary – Foundations of Semi-structured Data. In Foundations of Semistructured Data. Dagstuhl Seminar Proceedings, Volume 5061, pp. 1-5, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2005)

Copy BibTex To Clipboard

@InProceedings{neven_et_al:DagSemProc.05061.2,
  author =	{Neven, Frank and Schwentick, Thomas and Suciu, Dan},
  title =	{{05061 Summary – Foundations of Semi-structured Data}},
  booktitle =	{Foundations of Semistructured Data},
  pages =	{1--5},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2005},
  volume =	{5061},
  editor =	{Frank Neven and Thomas Schwentick and Dan Suciu},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.05061.2},
  URN =		{urn:nbn:de:0030-drops-2276},
  doi =		{10.4230/DagSemProc.05061.2},
  annote =	{Keywords: Report, summary}
}

Document

DOI: 10.4230/DagSemRep.318

Foundations of Semistructured Data (Dagstuhl Seminar 01361)

Authors: Alberto Mendelzon, Thomas Schwentick, and Dan Suciu

Published in: Dagstuhl Seminar Reports. Dagstuhl Seminar Reports, Volume 1 (2021)

Abstract

Cite as

Alberto Mendelzon, Thomas Schwentick, and Dan Suciu. Foundations of Semistructured Data (Dagstuhl Seminar 01361). Dagstuhl Seminar Report 318, pp. 1-22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2002)

Copy BibTex To Clipboard

@TechReport{mendelzon_et_al:DagSemRep.318,
  author =	{Mendelzon, Alberto and Schwentick, Thomas and Suciu, Dan},
  title =	{{Foundations of Semistructured Data (Dagstuhl Seminar 01361)}},
  pages =	{1--22},
  ISSN =	{1619-0203},
  year =	{2002},
  type = 	{Dagstuhl Seminar Report},
  number =	{318},
  institution =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemRep.318},
  URN =		{urn:nbn:de:0030-drops-152026},
  doi =		{10.4230/DagSemRep.318},
}

Search Results

Documents authored by Suciu, Dan

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as