DROPS

Document

BWT Indexes for Optimal Joins in Graph Databases

Authors: Diego Arroyuelo and Gonzalo Navarro

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)

Abstract

Graph databases represent data as a labeled directed graph, where the labels refer to properties that connect the entities represented by their source and target vertices. Queries feature, most prominently, sets of edges where source, target, and/or label can be variables; each instantiation of the variables where all the edges occur in the graph is a solution to the query. Worst-case-optimal algorithms to solve those queries have been devised, but they pose significant space requirements. This overhead has hindered the adoption of worst-case-optimal algorithms in real systems. We show that a representation of the graph based on the extended BWT (eBWT), where each edge is seen as an independent string of length 3 (source, label, target) supports worst-case-optimal algorithms while using almost no extra space on top of the raw data. We then show how the idea is generalized to the relational model, where the strings can be longer than 3 and several eBWTs are needed to obtain worst-case optimality. The aim to minimize the amount of space in that case leads to consider novel eBWT variants, where columns other than the last can be chosen. Finally, we show how the same graph representation can be used to solve other typical queries, like finding graph paths that match regular expressions.

Cite as

Diego Arroyuelo and Gonzalo Navarro. BWT Indexes for Optimal Joins in Graph Databases. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 14:1-14:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{arroyuelo_et_al:OASIcs.Manzini.14,
  author =	{Arroyuelo, Diego and Navarro, Gonzalo},
  title =	{{BWT Indexes for Optimal Joins in Graph Databases}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{14:1--14:19},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.14},
  URN =		{urn:nbn:de:0030-drops-239222},
  doi =		{10.4230/OASIcs.Manzini.14},
  annote =	{Keywords: Graph databases, Ring index, extended BWT, compact data structures}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2025.25

Approximating Klee’s Measure Problem and a Lower Bound for Union Volume Estimation

Authors: Karl Bringmann, Kasper Green Larsen, André Nusser, Eva Rotenberg, and Yanheng Wang

Published in: LIPIcs, Volume 332, 41st International Symposium on Computational Geometry (SoCG 2025)

Abstract

Union volume estimation is a classical algorithmic problem. Given a family of objects O₁,…,O_n ⊂ ℝ^d, we want to approximate the volume of their union. In the special case where all objects are boxes (also called hyperrectangles) this is known as Klee’s measure problem. The state-of-the-art (1+ε)-approximation algorithm [Karp, Luby, Madras '89] for union volume estimation as well as Klee’s measure problem in constant dimension d uses a total of O(n/ε²) queries of three types: (i) determine the volume of O_i; (ii) sample a point uniformly at random from O_i; and (iii) ask whether a given point is contained in O_i. First, we show that if an algorithm learns about the objects only through these types of queries, then Ω(n/ε²) queries are necessary. In this sense, the complexity of [Karp, Luby, Madras '89] is optimal. Our lower bound holds even if the objects are equiponderous axis-aligned polygons in ℝ², if the containment query allows arbitrary (not necessarily sampled) points, and if the algorithm can spend arbitrary time and space examining the query responses. Second, we provide a more efficient approximation algorithm for Klee’s measure problem, which improves the running time from O(n/ε²) to O((n+1/ε²) ⋅ log^{O(d)} (n)). We circumvent our lower bound by exploiting the geometry of boxes in various ways: (1) We sort the boxes into classes of similar shapes after inspecting their corner coordinates. (2) With orthogonal range searching, we show how to sample points from the union of boxes in each class, and how to merge samples from different classes. (3) We bound the amount of wasted work by arguing that most pairs of classes have a small intersection.

Cite as

Karl Bringmann, Kasper Green Larsen, André Nusser, Eva Rotenberg, and Yanheng Wang. Approximating Klee’s Measure Problem and a Lower Bound for Union Volume Estimation. In 41st International Symposium on Computational Geometry (SoCG 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 332, pp. 25:1-25:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{bringmann_et_al:LIPIcs.SoCG.2025.25,
  author =	{Bringmann, Karl and Larsen, Kasper Green and Nusser, Andr\'{e} and Rotenberg, Eva and Wang, Yanheng},
  title =	{{Approximating Klee’s Measure Problem and a Lower Bound for Union Volume Estimation}},
  booktitle =	{41st International Symposium on Computational Geometry (SoCG 2025)},
  pages =	{25:1--25:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-370-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{332},
  editor =	{Aichholzer, Oswin and Wang, Haitao},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2025.25},
  URN =		{urn:nbn:de:0030-drops-231778},
  doi =		{10.4230/LIPIcs.SoCG.2025.25},
  annote =	{Keywords: approximation, volume of union, union of objects, query complexity}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.20

A Formal Language Perspective on Factorized Representations

Authors: Benny Kimelfeld, Wim Martens, and Matthias Niewerth

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Factorized representations (FRs) are a well-known tool to succinctly represent results of join queries and have been originally defined using the named database perspective. We define FRs in the unnamed database perspective and use them to establish several new connections. First, unnamed FRs can be exponentially more succinct than named FRs, but this difference can be alleviated by imposing a disjointness condition on columns. Conversely, named FRs can also be exponentially more succinct than unnamed FRs. Second, unnamed FRs are the same as (i.e., isomorphic to) context-free grammars for languages in which each word has the same length. This tight connection allows us to transfer a wide range of results on context-free grammars to database factorization; of which we offer a selection in the paper. Third, when we generalize unnamed FRs to arbitrary sets of tuples, they become a generalization of path multiset representations, a formalism that was recently introduced to succinctly represent sets of paths in the context of graph database query evaluation.

Cite as

Benny Kimelfeld, Wim Martens, and Matthias Niewerth. A Formal Language Perspective on Factorized Representations. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 20:1-20:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{kimelfeld_et_al:LIPIcs.ICDT.2025.20,
  author =	{Kimelfeld, Benny and Martens, Wim and Niewerth, Matthias},
  title =	{{A Formal Language Perspective on Factorized Representations}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{20:1--20:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.20},
  URN =		{urn:nbn:de:0030-drops-229614},
  doi =		{10.4230/LIPIcs.ICDT.2025.20},
  annote =	{Keywords: Databases, relational databases, graph databases, factorized databases, regular path queries, compact representations}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.22

Queries with External Predicates

Authors: Paolo Guagliardo, Leonid Libkin, Victor Marsault, Wim Martens, Filip Murlak, Liat Peterfreund, and Cristina Sirangelo

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Real-life query languages feature external predicates such as user-defined functions or built-in arithmetic and string operations. These predicates are often infinite, potentially leading to unsafe or non-computable queries. To overcome this, traditional languages such as SQL, put significant syntactic restrictions on the use of external predicates. These restrictions have been relaxed in a number of modern query languages, each doing it in their own way. Our goal therefore is to provide a theoretical basis for querying with external predicates. To this end, we formalize queries with external predicates based on the notion of access patterns. We develop a suitable evaluation model, based on Turing machines with oracles, and tailor the classical notion of query safety to it. Since query safety is undecidable in general, we can only produce sufficient conditions for guaranteeing safety. We do so by developing an inference system to derive safety and computability for relational algebra, first-order logic, as well as for a language that combines them both.

Cite as

Paolo Guagliardo, Leonid Libkin, Victor Marsault, Wim Martens, Filip Murlak, Liat Peterfreund, and Cristina Sirangelo. Queries with External Predicates. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 22:1-22:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{guagliardo_et_al:LIPIcs.ICDT.2025.22,
  author =	{Guagliardo, Paolo and Libkin, Leonid and Marsault, Victor and Martens, Wim and Murlak, Filip and Peterfreund, Liat and Sirangelo, Cristina},
  title =	{{Queries with External Predicates}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{22:1--22:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.22},
  URN =		{urn:nbn:de:0030-drops-229635},
  doi =		{10.4230/LIPIcs.ICDT.2025.22},
  annote =	{Keywords: External predicates, Query safety, Computational model, Oracles, Infinite predicates, Access patterns, Relational algebra, First-order logic}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.6

PAC: Computing Join Queries with Semi-Covers

Authors: Heba Aamer and Bas Ketsman

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

An increased and growing interest in large-scale data processing has triggered a demand for specialized algorithms that thrive in massively parallel shared-nothing systems. To answer the question of how to efficiently compute join queries in this setting, a rich line of research has emerged specifically for the Massively Parallel Communication (MPC) model. In the MPC model, algorithms are executed in rounds, with each round consisting of a synchronized communication phase and a separate local computation phase. The main cost measure is the load of the algorithm, defined as the maximum number of messages received by any server in any round. We study worst-case optimal algorithms for the join query evaluation problem in the constant-round MPC model. In the single-round variant of MPC, the worst-case optimal load for this problem is well understood and algorithms exist that guarantee this load for any join query. In the constant-round variant of MPC, queries can often be computed with a lower load compared to the single-round variant, but the worst-case optimal load is only known for specific classes of join queries, including graph-like and acyclic join queries, and the associated algorithms use very different techniques. In this paper, we propose a new constant-round MPC algorithm for computing join queries. Our algorithm is correct for every join query and its load matches (up to a polylog factor) the worst-case optimal load for at least all join queries that are acyclic or graph-like.

Cite as

Heba Aamer and Bas Ketsman. PAC: Computing Join Queries with Semi-Covers. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 6:1-6:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{aamer_et_al:LIPIcs.ICDT.2025.6,
  author =	{Aamer, Heba and Ketsman, Bas},
  title =	{{PAC: Computing Join Queries with Semi-Covers}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{6:1--6:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.6},
  URN =		{urn:nbn:de:0030-drops-229474},
  doi =		{10.4230/LIPIcs.ICDT.2025.6},
  annote =	{Keywords: Worst-case optimal load, MPC model, join queries}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.31

On Deciding the Data Complexity of Answering Linear Monadic Datalog Queries with LTL Operators

Authors: Alessandro Artale, Anton Gnatenko, Vladislav Ryzhikov, and Michael Zakharyaschev

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Our concern is the data complexity of answering linear monadic datalog queries whose atoms in the rule bodies can be prefixed by operators of linear temporal logic LTL. We first observe that, for data complexity, answering any connected query with operators ○/○- (at the next/previous moment) is either in AC⁰, or in ACC⁰\AC⁰, or NC¹-complete, or L-hard and in NL. Then we show that the problem of deciding L-hardness of answering such queries is PSpace-complete, while checking membership in the classes AC⁰ and ACC⁰ as well as NC¹-completeness can be done in ExpSpace. Finally, we prove that membership in AC⁰ or in ACC⁰, NC¹-completeness, and L-hardness are undecidable for queries with operators ◇/◇- (sometime in the future/past) provided that NC¹ ≠ NL and L ≠ NL.

Cite as

Alessandro Artale, Anton Gnatenko, Vladislav Ryzhikov, and Michael Zakharyaschev. On Deciding the Data Complexity of Answering Linear Monadic Datalog Queries with LTL Operators. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 31:1-31:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{artale_et_al:LIPIcs.ICDT.2025.31,
  author =	{Artale, Alessandro and Gnatenko, Anton and Ryzhikov, Vladislav and Zakharyaschev, Michael},
  title =	{{On Deciding the Data Complexity of Answering Linear Monadic Datalog Queries with LTL Operators}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{31:1--31:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.31},
  URN =		{urn:nbn:de:0030-drops-229723},
  doi =		{10.4230/LIPIcs.ICDT.2025.31},
  annote =	{Keywords: Linear monadic datalog, linear temporal logic, data complexity}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.27

Parallel Query Processing with Heterogeneous Machines

Authors: Simon Frisk and Paraschos Koutris

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

We study the problem of computing a full Conjunctive Query in parallel using p heterogeneous machines. Our computational model is similar to the MPC model, but each machine has its own cost function mapping from the number of bits it receives to a cost. An optimal algorithm should minimize the maximum cost across all machines. We consider algorithms over a single communication round and give a lower bound and matching upper bound for databases where each relation has the same cardinality. We do this for both linear cost functions like in previous work, but also for more general cost functions. For databases with relations of different cardinalities, we also find a lower bound, and give matching upper bounds for specific queries like the cartesian product, the join, the star query, and the triangle query. Our approach is inspired by the HyperCube algorithm, but there are additional challenges involved when machines have heterogeneous cost functions.

Cite as

Simon Frisk and Paraschos Koutris. Parallel Query Processing with Heterogeneous Machines. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 27:1-27:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{frisk_et_al:LIPIcs.ICDT.2025.27,
  author =	{Frisk, Simon and Koutris, Paraschos},
  title =	{{Parallel Query Processing with Heterogeneous Machines}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{27:1--27:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.27},
  URN =		{urn:nbn:de:0030-drops-229683},
  doi =		{10.4230/LIPIcs.ICDT.2025.27},
  annote =	{Keywords: Joins, Massively Parallel Computation, Heterogeneous}
}

Document

Database Theory in Action

DOI: 10.4230/LIPIcs.ICDT.2025.36

Database Theory in Action: Cypher, GQL, and Regular Path Queries

Authors: Amélie Gheerbrant, Leonid Libkin, Liat Peterfreund, and Alexandra Rogova

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Cypher has so far been the most commonly used query language for property graphs, and served as the foundation of the recently standardized graph query language GQL. In designing the features of GQL, the standards committee addressed the perceived limitations of Cypher. One such limitation is the inability of Cypher, as originally designed, to express all regular path queries (RPQs). Despite this claim having been stated many times as a folklore result, we could not find any proof of it. In this note we formalize the core of Cypher’s pattern matching and formally prove that indeed it falls short of all RPQs, justifying the inclusion of new pattern matching features in GQL.

Cite as

Amélie Gheerbrant, Leonid Libkin, Liat Peterfreund, and Alexandra Rogova. Database Theory in Action: Cypher, GQL, and Regular Path Queries. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 36:1-36:5, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{gheerbrant_et_al:LIPIcs.ICDT.2025.36,
  author =	{Gheerbrant, Am\'{e}lie and Libkin, Leonid and Peterfreund, Liat and Rogova, Alexandra},
  title =	{{Database Theory in Action: Cypher, GQL, and Regular Path Queries}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{36:1--36:5},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.36},
  URN =		{urn:nbn:de:0030-drops-229771},
  doi =		{10.4230/LIPIcs.ICDT.2025.36},
  annote =	{Keywords: Regular path queries, Cypher, GQL, inexpressibility}
}

Document

Position

DOI: 10.4230/TGDK.1.1.5

Knowledge Graphs for the Life Sciences: Recent Developments, Challenges and Opportunities

Authors: Jiaoyan Chen, Hang Dong, Janna Hastings, Ernesto Jiménez-Ruiz, Vanessa López, Pierre Monnin, Catia Pesquita, Petr Škoda, and Valentina Tamma

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1

Abstract

The term life sciences refers to the disciplines that study living organisms and life processes, and include chemistry, biology, medicine, and a range of other related disciplines. Research efforts in life sciences are heavily data-driven, as they produce and consume vast amounts of scientific data, much of which is intrinsically relational and graph-structured. The volume of data and the complexity of scientific concepts and relations referred to therein promote the application of advanced knowledge-driven technologies for managing and interpreting data, with the ultimate aim to advance scientific discovery. In this survey and position paper, we discuss recent developments and advances in the use of graph-based technologies in life sciences and set out a vision for how these technologies will impact these fields into the future. We focus on three broad topics: the construction and management of Knowledge Graphs (KGs), the use of KGs and associated technologies in the discovery of new knowledge, and the use of KGs in artificial intelligence applications to support explanations (explainable AI). We select a few exemplary use cases for each topic, discuss the challenges and open research questions within these topics, and conclude with a perspective and outlook that summarizes the overarching challenges and their potential solutions as a guide for future research.

Cite as

Jiaoyan Chen, Hang Dong, Janna Hastings, Ernesto Jiménez-Ruiz, Vanessa López, Pierre Monnin, Catia Pesquita, Petr Škoda, and Valentina Tamma. Knowledge Graphs for the Life Sciences: Recent Developments, Challenges and Opportunities. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 5:1-5:33, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@Article{chen_et_al:TGDK.1.1.5,
  author =	{Chen, Jiaoyan and Dong, Hang and Hastings, Janna and Jim\'{e}nez-Ruiz, Ernesto and L\'{o}pez, Vanessa and Monnin, Pierre and Pesquita, Catia and \v{S}koda, Petr and Tamma, Valentina},
  title =	{{Knowledge Graphs for the Life Sciences: Recent Developments, Challenges and Opportunities}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{5:1--5:33},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.5},
  URN =		{urn:nbn:de:0030-drops-194791},
  doi =		{10.4230/TGDK.1.1.5},
  annote =	{Keywords: Knowledge graphs, Life science, Knowledge discovery, Explainable AI}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2023.1

A Researcher’s Digest of GQL (Invited Talk)

Authors: Nadime Francis, Amélie Gheerbrant, Paolo Guagliardo, Leonid Libkin, Victor Marsault, Wim Martens, Filip Murlak, Liat Peterfreund, Alexandra Rogova, and Domagoj Vrgoč

Published in: LIPIcs, Volume 255, 26th International Conference on Database Theory (ICDT 2023)

Abstract

GQL (Graph Query Language) is being developed as a new ISO standard for graph query languages to play the same role for graph databases as SQL plays for relational. In parallel, an extension of SQL for querying property graphs, SQL/PGQ, is added to the SQL standard; it shares the graph pattern matching functionality with GQL. Both standards (not yet published) are hard-to-understand specifications of hundreds of pages. The goal of this paper is to present a digest of the language that is easy for the research community to understand, and thus to initiate research on these future standards for querying graphs. The paper concentrates on pattern matching features shared by GQL and SQL/PGQ, as well as querying facilities of GQL.

Cite as

Nadime Francis, Amélie Gheerbrant, Paolo Guagliardo, Leonid Libkin, Victor Marsault, Wim Martens, Filip Murlak, Liat Peterfreund, Alexandra Rogova, and Domagoj Vrgoč. A Researcher’s Digest of GQL (Invited Talk). In 26th International Conference on Database Theory (ICDT 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 255, pp. 1:1-1:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{francis_et_al:LIPIcs.ICDT.2023.1,
  author =	{Francis, Nadime and Gheerbrant, Am\'{e}lie and Guagliardo, Paolo and Libkin, Leonid and Marsault, Victor and Martens, Wim and Murlak, Filip and Peterfreund, Liat and Rogova, Alexandra and Vrgo\v{c}, Domagoj},
  title =	{{A Researcher’s Digest of GQL}},
  booktitle =	{26th International Conference on Database Theory (ICDT 2023)},
  pages =	{1:1--1:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-270-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{255},
  editor =	{Geerts, Floris and Vandevoort, Brecht},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2023.1},
  URN =		{urn:nbn:de:0030-drops-177434},
  doi =		{10.4230/LIPIcs.ICDT.2023.1},
  annote =	{Keywords: GQL, Property Graph, Query Language, Graph Database, Pattern matching, Multi-Graph}
}

@InProceedings{francis_et_al:LIPIcs.ICDT.2023.1,
  author =	{Francis, Nadime and Gheerbrant, Am\'{e}lie and Guagliardo, Paolo and Libkin, Leonid and Marsault, Victor and Martens, Wim and Murlak, Filip and Peterfreund, Liat and Rogova, Alexandra and Vrgo\v{c}, Domagoj},
  title =	{{A Researcher’s Digest of GQL}},
  booktitle =	{26th International Conference on Database Theory (ICDT 2023)},
  pages =	{1:1--1:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-270-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{255},
  editor =	{Geerts, Floris and Vandevoort, Brecht},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2023.1},
  URN =		{urn:nbn:de:0030-drops-177434},
  doi =		{10.4230/LIPIcs.ICDT.2023.1},
  annote =	{Keywords: GQL, Property Graph, Query Language, Graph Database, Pattern matching, Multi-Graph}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2019.8

Fragments of Bag Relational Algebra: Expressiveness and Certain Answers

Authors: Marco Console, Paolo Guagliardo, and Leonid Libkin

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)

Abstract

While all relational database systems are based on the bag data model, much of theoretical research still views relations as sets. Recent attempts to provide theoretical foundations for modern data management problems under the bag semantics concentrated on applications that need to deal with incomplete relations, i.e., relations populated by constants and nulls. Our goal is to provide a complete characterization of the complexity of query answering over such relations in fragments of bag relational algebra. The main challenges that we face are twofold. First, bag relational algebra has more operations than its set analog (e.g., additive union, max-union, min-intersection, duplicate elimination) and the relationship between various fragments is not fully known. Thus we first fill this gap. Second, we look at query answering over incomplete data, which again is more complex than in the set case: rather than certainty and possibility of answers, we now have numerical information about occurrences of tuples. We then fully classify the complexity of finding this information in all the fragments of bag relational algebra.

Cite as

Marco Console, Paolo Guagliardo, and Leonid Libkin. Fragments of Bag Relational Algebra: Expressiveness and Certain Answers. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 8:1-8:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{console_et_al:LIPIcs.ICDT.2019.8,
  author =	{Console, Marco and Guagliardo, Paolo and Libkin, Leonid},
  title =	{{Fragments of Bag Relational Algebra: Expressiveness and Certain Answers}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  pages =	{8:1--8:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.8},
  URN =		{urn:nbn:de:0030-drops-103106},
  doi =		{10.4230/LIPIcs.ICDT.2019.8},
  annote =	{Keywords: bag semantics, relational algebra, expressivity, certain answers, complexity}
}

Document

DOI: 10.4230/DFU.Vol5.10452.129

Query Processing in Data Integration

Authors: Paolo Guagliardo and Piotr Wieczorek

Published in: Dagstuhl Follow-Ups, Volume 5, Data Exchange, Integration, and Streams (2013)

Abstract

In this chapter we illustrate the main techniques for processing queries in data integration. The first part of the chapter focuses on the problem of query answering in the relational setting, and describes approaches based on variants of the chase, along with how to deal with integrity constraints and access patterns. The second part of the chapter investigates query processing in the context of semistructured data, which is best described by graph-based data models, where the expressiveness of query languages not common in traditional database systems allows to point out the subtle differences between query answering and query rewriting. The chapter is closed by a very brief discussion of query processing in data integration with XML and ontologies.

Cite as

Paolo Guagliardo and Piotr Wieczorek. Query Processing in Data Integration. In Data Exchange, Integration, and Streams. Dagstuhl Follow-Ups, Volume 5, pp. 129-160, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2013)

Copy BibTex To Clipboard

@InCollection{guagliardo_et_al:DFU.Vol5.10452.129,
  author =	{Guagliardo, Paolo and Wieczorek, Piotr},
  title =	{{Query Processing in Data Integration}},
  booktitle =	{Data Exchange, Integration, and Streams},
  pages =	{129--160},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-61-3},
  ISSN =	{1868-8977},
  year =	{2013},
  volume =	{5},
  editor =	{Kolaitis, Phokion G. and Lenzerini, Maurizio and Schweikardt, Nicole},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DFU.Vol5.10452.129},
  URN =		{urn:nbn:de:0030-drops-42929},
  doi =		{10.4230/DFU.Vol5.10452.129},
  annote =	{Keywords: Query processing, Data integration, Relational data, Semistructured data}
}

12 Search Results for "Guagliardo, Paolo"

BWT Indexes for Optimal Joins in Graph Databases

Abstract

Cite as

Approximating Klee’s Measure Problem and a Lower Bound for Union Volume Estimation

Abstract

Cite as

A Formal Language Perspective on Factorized Representations

Abstract

Cite as

Queries with External Predicates

Abstract

Cite as

PAC: Computing Join Queries with Semi-Covers

Abstract

Cite as

On Deciding the Data Complexity of Answering Linear Monadic Datalog Queries with LTL Operators

Abstract

Cite as

Parallel Query Processing with Heterogeneous Machines

Abstract

Cite as

Database Theory in Action: Cypher, GQL, and Regular Path Queries

Abstract

Cite as

Knowledge Graphs for the Life Sciences: Recent Developments, Challenges and Opportunities

Abstract

Cite as

A Researcher’s Digest of GQL (Invited Talk)

Abstract

Cite as

Fragments of Bag Relational Algebra: Expressiveness and Certain Answers

Abstract

Cite as

Query Processing in Data Integration

Abstract

Cite as

Thanks for your feedback!

Could not send message