DROPS

Document

DOI: 10.4230/LIPIcs.ICDT.2022.14

Inference of Shape Graphs for Graph Databases

Authors: Benoît Groz, Aurélien Lemay, Sławek Staworko, and Piotr Wieczorek

Published in: LIPIcs, Volume 220, 25th International Conference on Database Theory (ICDT 2022)

Abstract

We investigate the problem of constructing a shape graph that describes the structure of a given graph database. We employ the framework of grammatical inference, where the objective is to find an inference algorithm that is both sound, i.e., always producing a schema that validates the input graph, and complete, i.e., able to produce any schema, within a given class of schemas, provided that a sufficiently informative input graph is presented. We identify a number of fundamental limitations that preclude feasible inference. We present inference algorithms based on natural approaches that make it possible to infer schemas that we argue to be of practical importance.

Cite as

Benoît Groz, Aurélien Lemay, Sławek Staworko, and Piotr Wieczorek. Inference of Shape Graphs for Graph Databases. In 25th International Conference on Database Theory (ICDT 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 220, pp. 14:1-14:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

@InProceedings{groz_et_al:LIPIcs.ICDT.2022.14,
  author =	{Groz, Beno\^{i}t and Lemay, Aur\'{e}lien and Staworko, S{\l}awek and Wieczorek, Piotr},
  title =	{{Inference of Shape Graphs for Graph Databases}},
  booktitle =	{25th International Conference on Database Theory (ICDT 2022)},
  pages =	{14:1--14:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-223-5},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{220},
  editor =	{Olteanu, Dan and Vortmeier, Nils},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2022.14},
  URN =		{urn:nbn:de:0030-drops-158889},
  doi =		{10.4230/LIPIcs.ICDT.2022.14},
  annote =	{Keywords: RDF, Schema, Inference, Learning, Fitting, Minimality, Containment}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2020.3

Current Challenges in Graph Databases (Invited Talk)

Authors: Juan L. Reutter

Published in: LIPIcs, Volume 155, 23rd International Conference on Database Theory (ICDT 2020)

Abstract

As graph databases grow in popularity, decades of work in graph query languages and models are materialising in industry standards and in the construction of new graph database systems. However, this surge in graph systems has in turn opened up a series of new, interesting research problems related to graph databases. Our first set of problems has to do with more efficient ways of computing the answers of graph queries, specifically graph patterns, path queries, and combinations of them. Traditionally, researchers in graph databases have pointed out that relational systems are ill-equipped to process these types of queries, and if one looks at the performance of native graph database systems, there is clearly a lot of room for improvement. The talk focuses on two possible directions for improving the state of the art in graph query processing. The first is implementing worst-case optimal algorithms for processing graph patterns that translate into relational queries with several joins. Some advances are already in development (see e.g. Nguyen, Dung, et al. "Join processing for graph patterns: An old dog with new tricks." GRADES'15. or Hogan, Aidan, et al. "A Worst-Case Optimal Join Algorithm for SPARQL." ISWC'19.), but we are still far from a full-fledged solution: most algorithms require complex data structures, or need further support in terms of heuristics to select an order in which joins are processed. Second, we need to understand what is the best way of evaluating path queries (that is, finding all pairs of nodes connected by a path), in such a way that these results can be further integrated with other query results in a graph system pipeline. We already have complexity results regarding path computation and enumeration for different semantics of path queries (see e.g. Martens, Wim, and Tina Trautner. "Evaluation and enumeration problems for regular path queries." ICDT'18. or Bagan, Guillaume, Angela Bonifati, and Benoit Groz. "A trichotomy for regular simple path queries on graphs." PODS'13.), but still very little is known in terms of optimal processing of path queries when inside a tractable fragment. Our second set of problems is related to graph analytics, one of the current selling points of graph databases. Systems should be able to run more complex analytical queries involving tasks such as more complex path finding, centrality or clustering. It is also important to be able to run these algorithms not over native graphs, but perhaps over a certain set of nodes or edges previously selected by a graph query, and one may also want to pose further queries over the result of the analytics task. Finally, all of this should be done in an efficient way, especially given that graph databases may contain a huge number of nodes. In this talk I will discuss possible approaches to perform these operations, covering aspects from the design of languages for graph analytics to efficient ways of processing them, and also comparing the expressive power of graph analytics solutions with other forms of graph computation.

Cite as

Juan L. Reutter. Current Challenges in Graph Databases (Invited Talk). In 23rd International Conference on Database Theory (ICDT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 155, p. 3:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{reutter:LIPIcs.ICDT.2020.3,
  author =	{Reutter, Juan L.},
  title =	{{Current Challenges in Graph Databases}},
  booktitle =	{23rd International Conference on Database Theory (ICDT 2020)},
  pages =	{3:1--3:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-139-9},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{155},
  editor =	{Lutz, Carsten and Jung, Jean Christoph},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2020.3},
  URN =		{urn:nbn:de:0030-drops-119272},
  doi =		{10.4230/LIPIcs.ICDT.2020.3},
  annote =	{Keywords: Graph databases, Join algorithms, path queries, graph analytics}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2016.12

Filtering With the Crowd: CrowdScreen Revisited

Authors: Benoit Groz, Ezra Levin, Isaac Meilijson, and Tova Milo

Published in: LIPIcs, Volume 48, 19th International Conference on Database Theory (ICDT 2016)

Abstract

Filtering a set of items, based on a set of properties that can be verified by humans, is a common application of CrowdSourcing. When the workers are error-prone, each item is presented to multiple users, to limit the probability of misclassification. Since the Crowd is a relatively expensive resource, minimizing the number of questions per item may naturally result in big savings. Several algorithms to address this minimization problem have been presented in the CrowdScreen framework by Parameswaran et al. However, those algorithms do not scale well and therefore cannot be used in scenarios where high accuracy is required in spite of high user error rates. The goal of this paper is thus to devise algorithms that can cope with such situations. To achieve this, we provide new theoretical insights into the problem, then use them to develop a new efficient algorithm. We also propose novel optimizations for the algorithms of CrowdScreen that improve their scalability. We complement our theoretical study with an experimental evaluation of the algorithms on a large set of synthetic parameters as well as real-life crowdsourcing scenarios, demonstrating the advantages of our solution.

Cite as

Benoit Groz, Ezra Levin, Isaac Meilijson, and Tova Milo. Filtering With the Crowd: CrowdScreen Revisited. In 19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, pp. 12:1-12:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{groz_et_al:LIPIcs.ICDT.2016.12,
  author =	{Groz, Benoit and Levin, Ezra and Meilijson, Isaac and Milo, Tova},
  title =	{{Filtering With the Crowd: CrowdScreen Revisited}},
  booktitle =	{19th International Conference on Database Theory (ICDT 2016)},
  pages =	{12:1--12:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-002-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{48},
  editor =	{Martens, Wim and Zeume, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2016.12},
  URN =		{urn:nbn:de:0030-drops-57817},
  doi =		{10.4230/LIPIcs.ICDT.2016.12},
  annote =	{Keywords: CrowdSourcing, filtering, algorithms, sprt, hypothesis testing}
}

3 Search Results for "Groz, Benoit"

