DROPS

Volume

LIPIcs, Volume 48

19th International Conference on Database Theory (ICDT 2016)

ICDT 2016, March 15-18, 2016, Bordeaux, France

Editors: Wim Martens and Thomas Zeume

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2023.1

A Researcher’s Digest of GQL (Invited Talk)

Authors: Nadime Francis, Amélie Gheerbrant, Paolo Guagliardo, Leonid Libkin, Victor Marsault, Wim Martens, Filip Murlak, Liat Peterfreund, Alexandra Rogova, and Domagoj Vrgoč

Published in: LIPIcs, Volume 255, 26th International Conference on Database Theory (ICDT 2023)

Abstract

GQL (Graph Query Language) is being developed as a new ISO standard for graph query languages to play the same role for graph databases as SQL plays for relational ones. In parallel, an extension of SQL for querying property graphs, SQL/PGQ, is being added to the SQL standard; it shares the graph pattern matching functionality with GQL. Both standards (not yet published) are hard-to-understand specifications of hundreds of pages. The goal of this paper is to present a digest of the language that is easy for the research community to understand, and thus to initiate research on these future standards for querying graphs. The paper concentrates on pattern matching features shared by GQL and SQL/PGQ, as well as querying facilities of GQL.

Cite as

Nadime Francis, Amélie Gheerbrant, Paolo Guagliardo, Leonid Libkin, Victor Marsault, Wim Martens, Filip Murlak, Liat Peterfreund, Alexandra Rogova, and Domagoj Vrgoč. A Researcher’s Digest of GQL (Invited Talk). In 26th International Conference on Database Theory (ICDT 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 255, pp. 1:1-1:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

@InProceedings{francis_et_al:LIPIcs.ICDT.2023.1,
  author =	{Francis, Nadime and Gheerbrant, Am\'{e}lie and Guagliardo, Paolo and Libkin, Leonid and Marsault, Victor and Martens, Wim and Murlak, Filip and Peterfreund, Liat and Rogova, Alexandra and Vrgo\v{c}, Domagoj},
  title =	{{A Researcher’s Digest of GQL}},
  booktitle =	{26th International Conference on Database Theory (ICDT 2023)},
  pages =	{1:1--1:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-270-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{255},
  editor =	{Geerts, Floris and Vandevoort, Brecht},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2023.1},
  URN =		{urn:nbn:de:0030-drops-177434},
  doi =		{10.4230/LIPIcs.ICDT.2023.1},
  annote =	{Keywords: GQL, Property Graph, Query Language, Graph Database, Pattern matching, Multi-Graph}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2021.10

The Complexity of Aggregates over Extractions by Regular Expressions

Authors: Johannes Doleschal, Noa Bratman, Benny Kimelfeld, and Wim Martens

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

Regular expressions with capture variables, also known as "regex-formulas", extract relations of spans (intervals identified by their start and end indices) from text. In turn, the class of regular document spanners is the closure of the regex formulas under the Relational Algebra. We investigate the computational complexity of querying text by aggregate functions, such as sum, average, and quantile, on top of regular document spanners. To this end, we formally define aggregate functions over regular document spanners and analyze the computational complexity of exact and approximate computation. More precisely, we show that in a restricted case, all studied aggregate functions can be computed in polynomial time. In general, however, even though exact computation is intractable, some aggregates can still be approximated with fully polynomial-time randomized approximation schemes (FPRAS).
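
As a rough illustration of the setting described in this abstract, the following minimal sketch uses Python's standard `re` module to extract spans for named capture variables and then aggregates over them. The sample text, the pattern, and the helper name `extract_spans` are assumptions made for illustration only. Note that `re.finditer` returns only non-overlapping leftmost matches, which is narrower than the all-matches semantics usually assumed for document spanners, so this is not the paper's formal model.

```python
import re

def extract_spans(pattern, text):
    """Return one row per match, mapping each capture variable to its (start, end) span."""
    return [
        {name: m.span(name) for name in m.re.groupindex}
        for m in re.finditer(pattern, text)
    ]

# Hypothetical input document and regex with two capture variables.
text = "apples 3, pears 12, plums 7"
pattern = r"(?P<item>[a-z]+) (?P<qty>\d+)"

rows = extract_spans(pattern, text)

# Aggregate (sum) over the substrings selected by the 'qty' spans.
total = sum(int(text[start:end]) for start, end in (row["qty"] for row in rows))
```

Each row is a tuple of spans rather than of strings; the aggregate is computed by reading the document at those spans.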

Cite as

Johannes Doleschal, Noa Bratman, Benny Kimelfeld, and Wim Martens. The Complexity of Aggregates over Extractions by Regular Expressions. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 10:1-10:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{doleschal_et_al:LIPIcs.ICDT.2021.10,
  author =	{Doleschal, Johannes and Bratman, Noa and Kimelfeld, Benny and Martens, Wim},
  title =	{{The Complexity of Aggregates over Extractions by Regular Expressions}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{10:1--10:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.10},
  URN =		{urn:nbn:de:0030-drops-137181},
  doi =		{10.4230/LIPIcs.ICDT.2021.10},
  annote =	{Keywords: Information extraction, document spanners, regular expressions, aggregation functions}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2020.3

Current Challenges in Graph Databases (Invited Talk)

Authors: Juan L. Reutter

Published in: LIPIcs, Volume 155, 23rd International Conference on Database Theory (ICDT 2020)

Abstract

As graph databases grow in popularity, decades of work in graph query languages and models are materialising in industry standards and in the construction of new graph database systems. However, this surge in graph systems has in turn opened up a series of new, interesting research problems related to graph databases. Our first set of problems has to do with more efficient ways of computing the answers of graph queries, specifically graph patterns, path queries, and combinations of them. Traditionally, researchers in graph databases have pointed out that relational systems are ill-equipped to process these types of queries, and if one looks at the performance of native graph database systems, there is clearly a lot of room for improvement. The talk focuses on two possible directions for improving the state of the art in graph query processing. The first is implementing worst-case optimal algorithms for processing graph patterns that translate into relational queries with several joins. Some advances are already in development (see e.g. Nguyen, Dung, et al. "Join processing for graph patterns: An old dog with new tricks." GRADES'15. or Hogan, Aidan, et al. "A Worst-Case Optimal Join Algorithm for SPARQL." ISWC’19.), but we are still far from a full-fledged solution: most algorithms require complex data structures, or need further support in terms of heuristics to select an order in which joins are processed. Second, we need to understand what is the best way of evaluating path queries (that is, finding all pairs of nodes connected by a path), in such a way that these results can be further integrated with other query results in a graph system pipeline. We already have complexity results regarding path computation and enumeration for different semantics of path queries (see e.g. Martens, Wim, and Tina Trautner. "Evaluation and enumeration problems for regular path queries." ICDT'18. or Bagan, Guillaume, Angela Bonifati, and Benoit Groz. "A trichotomy for regular simple path queries on graphs." PODS'13.), but still very little is known in terms of optimal processing of path queries when inside a tractable fragment. Our second set of problems is related to graph analytics, one of the current selling points of graph databases. Systems should be able to run more complex analytical queries involving tasks such as more complex path finding, centrality or clustering. It is also important to be able to run these algorithms not over native graphs, but perhaps over a certain set of nodes or edges previously selected by a graph query, and one may also want to pose further queries over the result of the analytics task. Finally, all of this should be done in an efficient way, especially given that graph databases may contain a huge number of nodes. In this talk I will discuss possible approaches to perform these operations, covering aspects from the design of languages for graph analytics to efficient ways of processing them, and also comparing the expressive power of graph analytics solutions with other forms of graph computation.

Cite as

Juan L. Reutter. Current Challenges in Graph Databases (Invited Talk). In 23rd International Conference on Database Theory (ICDT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 155, p. 3:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{reutter:LIPIcs.ICDT.2020.3,
  author =	{Reutter, Juan L.},
  title =	{{Current Challenges in Graph Databases}},
  booktitle =	{23rd International Conference on Database Theory (ICDT 2020)},
  pages =	{3:1--3:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-139-9},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{155},
  editor =	{Lutz, Carsten and Jung, Jean Christoph},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2020.3},
  URN =		{urn:nbn:de:0030-drops-119272},
  doi =		{10.4230/LIPIcs.ICDT.2020.3},
  annote =	{Keywords: Graph databases, Join algorithms, path queries, graph analytics}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2020.8

Weight Annotation in Information Extraction

Authors: Johannes Doleschal, Benny Kimelfeld, Wim Martens, and Liat Peterfreund

Published in: LIPIcs, Volume 155, 23rd International Conference on Database Theory (ICDT 2020)

Abstract

The framework of document spanners abstracts the task of information extraction from text as a function that maps every document (a string) into a relation over the document’s spans (intervals identified by their start and end indices). For instance, the regular spanners are the closure under the Relational Algebra (RA) of the regular expressions with capture variables, and the expressive power of the regular spanners is precisely captured by the class of vset-automata - a restricted class of transducers that mark the endpoints of selected spans. In this work, we embark on the investigation of document spanners that can annotate extractions with auxiliary information such as confidence, support, and confidentiality measures. To this end, we adopt the abstraction of provenance semirings by Green et al., where tuples of a relation are annotated with the elements of a commutative semiring, and where the annotation propagates through the (positive) RA operators via the semiring operators. Hence, the proposed spanner extension, referred to as an annotator, maps every string into an annotated relation over the spans. As a specific instantiation, we explore weighted vset-automata that, similarly to weighted automata and transducers, attach semiring elements to transitions. We investigate key aspects of expressiveness, such as the closure under the positive RA, and key aspects of computational complexity, such as the enumeration of annotated answers and their ranked enumeration in the case of numeric semirings. For a number of these problems, fundamental properties of the underlying semiring, such as positivity, are crucial for establishing tractability.

Cite as

Johannes Doleschal, Benny Kimelfeld, Wim Martens, and Liat Peterfreund. Weight Annotation in Information Extraction. In 23rd International Conference on Database Theory (ICDT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 155, pp. 8:1-8:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{doleschal_et_al:LIPIcs.ICDT.2020.8,
  author =	{Doleschal, Johannes and Kimelfeld, Benny and Martens, Wim and Peterfreund, Liat},
  title =	{{Weight Annotation in Information Extraction}},
  booktitle =	{23rd International Conference on Database Theory (ICDT 2020)},
  pages =	{8:1--8:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-139-9},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{155},
  editor =	{Lutz, Carsten and Jung, Jean Christoph},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2020.8},
  URN =		{urn:nbn:de:0030-drops-119325},
  doi =		{10.4230/LIPIcs.ICDT.2020.8},
  annote =	{Keywords: Information extraction, regular document spanners, weighted automata, provenance semirings, K-relations}
}

Document

DOI: 10.4230/LIPIcs.STACS.2020.7

A Trichotomy for Regular Trail Queries

Authors: Wim Martens, Matthias Niewerth, and Tina Trautner

Published in: LIPIcs, Volume 154, 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)

Abstract

Regular path queries (RPQs) are an essential component of graph query languages. Such queries consider a regular expression r and a directed edge-labeled graph G and search for paths in G for which the sequence of labels is in the language of r. In order to avoid having to consider infinitely many paths, some database engines restrict such paths to be trails, that is, they only consider paths without repeated edges. In this paper we consider the evaluation problem for RPQs under trail semantics, in the case where the expression is fixed. We show that, in this setting, there exists a trichotomy. More precisely, the complexity of RPQ evaluation divides the regular languages into the finite languages, the class T_tract (for which the problem is tractable), and the rest. Interestingly, the tractable class in the trichotomy is larger than for the trichotomy for simple paths, discovered by Bagan et al. [Bagan et al., 2013]. In addition to this trichotomy result, we also study characterizations of the tractable class, its expressivity, the recognition problem, closure properties, and show how the decision problem can be extended to the enumeration problem, which is relevant to practice.
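
To make trail semantics concrete, here is a minimal brute-force sketch, not the algorithm from the paper. It assumes the regular expression has already been compiled by hand into a deterministic automaton given as a transition dictionary, and it searches for a path from a source to a target node that spells an accepted word without reusing any edge. The function name `has_trail`, the graph encoding, and the example data are all hypothetical, and the search takes exponential time in the worst case.

```python
def has_trail(edges, dfa, start_state, accepting, source, target):
    """Check whether some trail (no repeated edge) from source to target
    is labelled by a word accepted by the automaton.

    edges: list of (u, label, v); an edge's identity is its list index,
           so parallel edges count as distinct.
    dfa:   dict mapping (state, label) -> next state.
    """
    def dfs(node, state, used):
        if node == target and state in accepting:
            return True
        for i, (u, label, v) in enumerate(edges):
            if u == node and i not in used and (state, label) in dfa:
                if dfs(v, dfa[(state, label)], used | {i}):
                    return True
        return False

    return dfs(source, start_state, frozenset())

# Automaton for the single word "aba": 0 -a-> 1 -b-> 2 -a-> 3 (accepting).
dfa = {(0, "a"): 1, (1, "b"): 2, (2, "a"): 3}

# With one a-edge, the only matching path x -> y -> x -> y reuses that edge.
two_edges = [("x", "a", "y"), ("y", "b", "x")]
# A second, parallel a-edge makes a genuine trail possible.
three_edges = two_edges + [("x", "a", "y")]

no_trail = has_trail(two_edges, dfa, 0, {3}, "x", "y")
with_trail = has_trail(three_edges, dfa, 0, {3}, "x", "y")
```

On the first graph a matching path exists under arbitrary-path semantics but not under trail semantics, which is the distinction the abstract draws.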

Cite as

Wim Martens, Matthias Niewerth, and Tina Trautner. A Trichotomy for Regular Trail Queries. In 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 154, pp. 7:1-7:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{martens_et_al:LIPIcs.STACS.2020.7,
  author =	{Martens, Wim and Niewerth, Matthias and Trautner, Tina},
  title =	{{A Trichotomy for Regular Trail Queries}},
  booktitle =	{37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)},
  pages =	{7:1--7:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-140-5},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{154},
  editor =	{Paul, Christophe and Bl\"{a}ser, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2020.7},
  URN =		{urn:nbn:de:0030-drops-118681},
  doi =		{10.4230/LIPIcs.STACS.2020.7},
  annote =	{Keywords: Regular languages, query languages, path queries, graph databases, databases, complexity, trails, simple paths}
}

Document

DOI: 10.4230/DagMan.7.1.1

Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151)

Authors: Serge Abiteboul, Marcelo Arenas, Pablo Barceló, Meghyn Bienvenu, Diego Calvanese, Claire David, Richard Hull, Eyke Hüllermeier, Benny Kimelfeld, Leonid Libkin, Wim Martens, Tova Milo, Filip Murlak, Frank Neven, Magdalena Ortiz, Thomas Schwentick, Julia Stoyanovich, Jianwen Su, Dan Suciu, Victor Vianu, and Ke Yi

Published in: Dagstuhl Manifestos, Volume 7, Issue 1 (2018)

Abstract

The area of Principles of Data Management (PDM) has made crucial contributions to the development of formal frameworks for understanding and managing data and knowledge. This work has involved a rich cross-fertilization between PDM and other disciplines in mathematics and computer science, including logic, complexity theory, and knowledge representation. We anticipate on-going expansion of PDM research as the technology and applications involving data management continue to grow and evolve. In particular, the lifecycle of Big Data Analytics raises a wealth of challenge areas that PDM can help with. In this report we identify some of the most important research directions where the PDM community has the potential to make significant contributions. This is done from three perspectives: potential practical relevance, results already obtained, and research questions that appear surmountable in the short and medium term.

Cite as

Serge Abiteboul, Marcelo Arenas, Pablo Barceló, Meghyn Bienvenu, Diego Calvanese, Claire David, Richard Hull, Eyke Hüllermeier, Benny Kimelfeld, Leonid Libkin, Wim Martens, Tova Milo, Filip Murlak, Frank Neven, Magdalena Ortiz, Thomas Schwentick, Julia Stoyanovich, Jianwen Su, Dan Suciu, Victor Vianu, and Ke Yi. Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151). In Dagstuhl Manifestos, Volume 7, Issue 1, pp. 1-29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@Article{abiteboul_et_al:DagMan.7.1.1,
  author =	{Abiteboul, Serge and Arenas, Marcelo and Barcel\'{o}, Pablo and Bienvenu, Meghyn and Calvanese, Diego and David, Claire and Hull, Richard and H\"{u}llermeier, Eyke and Kimelfeld, Benny and Libkin, Leonid and Martens, Wim and Milo, Tova and Murlak, Filip and Neven, Frank and Ortiz, Magdalena and Schwentick, Thomas and Stoyanovich, Julia and Su, Jianwen and Suciu, Dan and Vianu, Victor and Yi, Ke},
  title =	{{Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151)}},
  pages =	{1--29},
  journal =	{Dagstuhl Manifestos},
  ISSN =	{2193-2433},
  year =	{2018},
  volume =	{7},
  number =	{1},
  editor =	{Abiteboul, Serge and Arenas, Marcelo and Barcel\'{o}, Pablo and Bienvenu, Meghyn and Calvanese, Diego and David, Claire and Hull, Richard and H\"{u}llermeier, Eyke and Kimelfeld, Benny and Libkin, Leonid and Martens, Wim and Milo, Tova and Murlak, Filip and Neven, Frank and Ortiz, Magdalena and Schwentick, Thomas and Stoyanovich, Julia and Su, Jianwen and Suciu, Dan and Vianu, Victor and Yi, Ke},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagMan.7.1.1},
  URN =		{urn:nbn:de:0030-drops-86772},
  doi =		{10.4230/DagMan.7.1.1},
  annote =	{Keywords: database theory, principles of data management, query languages, efficient query processing, query optimization, heterogeneous data, uncertainty, knowledge-enriched data management, machine learning, workflows, human-related data, ethics}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2018.14

Satisfiability for SCULPT-Schemas for CSV-Like Data

Authors: Johannes Doleschal, Wim Martens, Frank Neven, and Adam Witkowski

Published in: LIPIcs, Volume 98, 21st International Conference on Database Theory (ICDT 2018)

Abstract

SCULPT is a simple schema language inspired by the recent working effort towards a recommendation by the World Wide Web Consortium (W3C) for tabular data and metadata on the Web. At its core, a SCULPT schema consists of a set of rules where left-hand sides select sets of regions in the tabular data and the right-hand sides describe the contents of these regions. A document (divided into cells by row- and column-delimiters) then satisfies a schema if it satisfies every rule. In this paper, we study the satisfiability problem for SCULPT schemas. As SCULPT describes grid-like structures, satisfiability obviously becomes undecidable rather quickly even for very restricted schemas. We define a schema language called L-SCULPT (Lego SCULPT) that restricts the walking power of SCULPT by selecting rectangular shaped areas and only considers tables for which selected regions do not intersect. Depending on the axes used by L-SCULPT, we show that satisfiability is PTIME-complete or undecidable. One of the tractable fragments is practically useful as it extends the structural core of the current W3C proposal for schemas over tabular data. We therefore see how the navigational power of the W3C proposal can be extended while still retaining tractable satisfiability tests.

Cite as

Johannes Doleschal, Wim Martens, Frank Neven, and Adam Witkowski. Satisfiability for SCULPT-Schemas for CSV-Like Data. In 21st International Conference on Database Theory (ICDT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 98, pp. 14:1-14:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{doleschal_et_al:LIPIcs.ICDT.2018.14,
  author =	{Doleschal, Johannes and Martens, Wim and Neven, Frank and Witkowski, Adam},
  title =	{{Satisfiability for SCULPT-Schemas for CSV-Like Data}},
  booktitle =	{21st International Conference on Database Theory (ICDT 2018)},
  pages =	{14:1--14:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-063-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{98},
  editor =	{Kimelfeld, Benny and Amsterdamer, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2018.14},
  URN =		{urn:nbn:de:0030-drops-85969},
  doi =		{10.4230/LIPIcs.ICDT.2018.14},
  annote =	{Keywords: CSV, schema languages, semi-structured data}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2018.19

Evaluation and Enumeration Problems for Regular Path Queries

Authors: Wim Martens and Tina Trautner

Published in: LIPIcs, Volume 98, 21st International Conference on Database Theory (ICDT 2018)

Abstract

Regular path queries (RPQs) are a central component of graph databases. We investigate decision- and enumeration problems concerning the evaluation of RPQs under several semantics that have recently been considered: arbitrary paths, shortest paths, and simple paths. Whereas arbitrary and shortest paths can be enumerated in polynomial delay, the situation is much more intricate for simple paths. For instance, already the question if a given graph contains a simple path of a certain length has cases with highly non-trivial solutions and cases that are long-standing open problems. We study RPQ evaluation for simple paths from a parameterized complexity perspective and define a class of simple transitive expressions that is prominent in practice and for which we can prove a dichotomy for the evaluation problem. We observe that, even though simple path semantics is intractable for RPQs in general, it is feasible for the vast majority of RPQs that are used in practice. At the heart of our study on simple paths is a result of independent interest: the two disjoint paths problem in directed graphs is W[1]-hard if parameterized by the length of one of the two paths.

Cite as

Wim Martens and Tina Trautner. Evaluation and Enumeration Problems for Regular Path Queries. In 21st International Conference on Database Theory (ICDT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 98, pp. 19:1-19:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{martens_et_al:LIPIcs.ICDT.2018.19,
  author =	{Martens, Wim and Trautner, Tina},
  title =	{{Evaluation and Enumeration Problems for Regular Path Queries}},
  booktitle =	{21st International Conference on Database Theory (ICDT 2018)},
  pages =	{19:1--19:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-063-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{98},
  editor =	{Kimelfeld, Benny and Amsterdamer, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2018.19},
  URN =		{urn:nbn:de:0030-drops-85947},
  doi =		{10.4230/LIPIcs.ICDT.2018.19},
  annote =	{Keywords: graph databases, regular path queries, regular languages, parameterized complexity}
}

Document

Complete Volume

DOI: 10.4230/LIPIcs.ICDT.2016

LIPIcs, Volume 48, ICDT'16, Complete Volume

Authors: Wim Martens and Thomas Zeume

Published in: LIPIcs, Volume 48, 19th International Conference on Database Theory (ICDT 2016)

Abstract

LIPIcs, Volume 48, ICDT'16, Complete Volume

Cite as

19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@Proceedings{martens_et_al:LIPIcs.ICDT.2016,
  title =	{{LIPIcs, Volume 48, ICDT'16, Complete Volume}},
  booktitle =	{19th International Conference on Database Theory (ICDT 2016)},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-002-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{48},
  editor =	{Martens, Wim and Zeume, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2016},
  URN =		{urn:nbn:de:0030-drops-57991},
  doi =		{10.4230/LIPIcs.ICDT.2016},
  annote =	{Keywords: Database Management, Normal forms, Schema and subschema, Query languages, Query processing, Relational databases, Distributed databases, Heterogeneous Databases, Online Information Services, Miscellaneous – Privacy, Office Automation: Workflow management}
}

Document

Front Matter

DOI: 10.4230/LIPIcs.ICDT.2016.0

Front Matter, Table of Contents, Preface, Conference Organization, External Reviewers, List of Authors

Authors: Wim Martens and Thomas Zeume

Published in: LIPIcs, Volume 48, 19th International Conference on Database Theory (ICDT 2016)

Abstract

Front Matter, Table of Contents, Preface, Conference Organization, External Reviewers, List of Authors

Cite as

19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, pp. 0:i-0:xvi, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{martens_et_al:LIPIcs.ICDT.2016.0,
  author =	{Martens, Wim and Zeume, Thomas},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization, External Reviewers, List of Authors}},
  booktitle =	{19th International Conference on Database Theory (ICDT 2016)},
  pages =	{0:i--0:xvi},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-002-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{48},
  editor =	{Martens, Wim and Zeume, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2016.0},
  URN =		{urn:nbn:de:0030-drops-57940},
  doi =		{10.4230/LIPIcs.ICDT.2016.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization, External Reviewers, List of Authors}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2016.1

The ICDT 2016 Test of Time Award Announcement

Authors: Foto N. Afrati, Claire David, and Georg Gottlob

Published in: LIPIcs, Volume 48, 19th International Conference on Database Theory (ICDT 2016)

Abstract

We describe the 2016 ICDT Test of Time Award which is awarded to Chandra Chekuri and Anand Rajaraman for their 1997 ICDT paper on "Conjunctive Query Containment Revisited".

Cite as

Foto N. Afrati, Claire David, and Georg Gottlob. The ICDT 2016 Test of Time Award Announcement. In 19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, pp. 1:1-1:2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{afrati_et_al:LIPIcs.ICDT.2016.1,
  author =	{Afrati, Foto N. and David, Claire and Gottlob, Georg},
  title =	{{The ICDT 2016 Test of Time Award Announcement}},
  booktitle =	{19th International Conference on Database Theory (ICDT 2016)},
  pages =	{1:1--1:2},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-002-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{48},
  editor =	{Martens, Wim and Zeume, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2016.1},
  URN =		{urn:nbn:de:0030-drops-57938},
  doi =		{10.4230/LIPIcs.ICDT.2016.1},
  annote =	{Keywords: conjunctive query, treewidth, NP-hardness, rewriting}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2016.2

Scale Independence: Using Small Data to Answer Queries on Big Data (Invited Talk)

Authors: Floris Geerts

Published in: LIPIcs, Volume 48, 19th International Conference on Database Theory (ICDT 2016)

Abstract

Large datasets introduce challenges to the scalability of query answering. Given a query Q and a dataset D, it is often prohibitively costly to compute the query answers Q(D) when D is big. To this end, one may want to use heuristics, "quick and dirty" algorithms which return approximate answers. However, in many applications it is a must to find exact query answers. So, how can we efficiently compute Q(D) when D is big or when we only have limited resources? One idea is to find a small subset D_Q of D such that Q(D_Q)=Q(D) where the size of D_Q is independent of the size of the underlying dataset D. Intuitively, when such a D_Q can be found for a query Q, the query is said to be scale independent (Armbrust et al. 2011, Armbrust et al. 2013, Fan et al. 2014). Indeed, for answering such queries the size of the underlying database does not matter, i.e., query processing is independent of the scale of the database. In this talk, I will survey various formalisms that enable large classes of queries to be scale independent. These formalisms primarily rely on the availability of access constraints, a combination of indexes and cardinality constraints, on the data (Fan et al. 2015, Fan et al. 2014). We will take a closer look at how, in the presence of such constraints, queries can often be compiled into efficient query plans that access a bounded amount of data (Cao et al. 2014, Fan et al. 2015), and how these techniques relate to query processing in the presence of access patterns (Benedikt et al. 2015, Benedikt et al. 2014, Deutsch et al. 2007). Finally, we illustrate that scale independent queries are quite common in practice and that they indeed can be efficiently answered on big datasets when access constraints are present (Cao et al. 2015, Cao et al. 2014).

Cite as

Floris Geerts. Scale Independence: Using Small Data to Answer Queries on Big Data (Invited Talk). In 19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, pp. 2:1-2:2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{geerts:LIPIcs.ICDT.2016.2,
  author =	{Geerts, Floris},
  title =	{{Scale Independence: Using Small Data to Answer Queries on Big Data}},
  booktitle =	{19th International Conference on Database Theory (ICDT 2016)},
  pages =	{2:1--2:2},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-002-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{48},
  editor =	{Martens, Wim and Zeume, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2016.2},
  URN =		{urn:nbn:de:0030-drops-57715},
  doi =		{10.4230/LIPIcs.ICDT.2016.2},
  annote =	{Keywords: Scale independence, Access constraints, Query processing}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2016.3

Top-k Indexes Made Small and Sweet (Invited Talk)

Authors: Yufei Tao

Published in: LIPIcs, Volume 48, 19th International Conference on Database Theory (ICDT 2016)

Abstract

Top-k queries have become extremely popular in the database community. Such a query, which is issued on a set of elements each carrying a real-valued weight, returns the k elements with the highest weights among all the elements that satisfy a predicate. As usual, an index structure is necessary to answer a query substantially faster than accessing the whole input set. The existing research on top-k queries can be classified into two categories. The first one, which is system-oriented, aims to devise indexes that are simple to understand and easy to implement. These indexes, typically designed with heuristics, are reasonably fast in practical applications, but do not necessarily offer strong performance guarantees - in other words, they are small but not sweet. The other category, which is theory-oriented, aims to develop indexes that promise attractive bounds on the space consumption and query overhead (sometimes also update cost). These indexes, unfortunately, are often excessively sophisticated in the adopted techniques, and are rarely applied in practice - they are sweet but not small. This talk will discuss the progress of an ongoing project that strives to take down the barrier between the two categories, by crafting a framework for acquiring simple top-k indexes with excellent performance guarantees - namely, small and sweet. This is achieved with reductions that produce top-k indexes automatically from the existing data structures for conventional reporting queries on unweighted elements (i.e., finding all elements satisfying a predicate), and/or the existing data structures on top-1 queries. Our reductions promise nearly no performance deterioration with respect to those existing structures, are general enough to be applicable to a huge variety of top-k problems, and work in both the external memory model and the RAM model.

Cite as

Yufei Tao. Top-k Indexes Made Small and Sweet (Invited Talk). In 19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, p. 3:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{tao:LIPIcs.ICDT.2016.3,
  author =	{Tao, Yufei},
  title =	{{Top-k Indexes Made Small and Sweet}},
  booktitle =	{19th International Conference on Database Theory (ICDT 2016)},
  pages =	{3:1--3:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-002-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{48},
  editor =	{Martens, Wim and Zeume, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2016.3},
  URN =		{urn:nbn:de:0030-drops-57725},
  doi =		{10.4230/LIPIcs.ICDT.2016.3},
  annote =	{Keywords: Data Structures, Top-k, External Memory, RAM, Reductions}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2016.4

New Algorithms for Heavy Hitters in Data Streams (Invited Talk)

Authors: David P. Woodruff

Published in: LIPIcs, Volume 48, 19th International Conference on Database Theory (ICDT 2016)

Abstract

An old and fundamental problem in databases and data streams is that of finding the heavy hitters, also known as the top-k, most popular items, frequent items, elephants, or iceberg queries. There are several variants of this problem, which quantify what it means for an item to be frequent, including what are known as the l_1-heavy hitters and l_2-heavy hitters. There are a number of algorithmic solutions for these problems, starting with the work of Misra and Gries, as well as the CountMin and CountSketch data structures, among others. In this paper (accompanying an invited talk) we cover several recent results developed in this area, which improve upon the classical solutions to these problems. In particular, we develop new algorithms for finding l_1-heavy hitters and l_2-heavy hitters, with significantly less memory required than what was known, and which are optimal in a number of parameter regimes.

Cite as

David P. Woodruff. New Algorithms for Heavy Hitters in Data Streams (Invited Talk). In 19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, pp. 4:1-4:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{woodruff:LIPIcs.ICDT.2016.4,
  author =	{Woodruff, David P.},
  title =	{{New Algorithms for Heavy Hitters in Data Streams}},
  booktitle =	{19th International Conference on Database Theory (ICDT 2016)},
  pages =	{4:1--4:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-002-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{48},
  editor =	{Martens, Wim and Zeume, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2016.4},
  URN =		{urn:nbn:de:0030-drops-57739},
  doi =		{10.4230/LIPIcs.ICDT.2016.4},
  annote =	{Keywords: data streams, heavy hitters}
}
