Search Results

Documents authored by Arroyuelo, Diego


Document
BWT Indexes for Optimal Joins in Graph Databases

Authors: Diego Arroyuelo and Gonzalo Navarro

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)


Abstract
Graph databases represent data as a labeled directed graph, where the labels refer to properties that connect the entities represented by their source and target vertices. Queries feature, most prominently, sets of edges where source, target, and/or label can be variables; each instantiation of the variables where all the edges occur in the graph is a solution to the query. Worst-case-optimal algorithms to solve those queries have been devised, but they pose significant space requirements. This overhead has hindered the adoption of worst-case-optimal algorithms in real systems. We show that a representation of the graph based on the extended BWT (eBWT), where each edge is seen as an independent string of length 3 (source, label, target) supports worst-case-optimal algorithms while using almost no extra space on top of the raw data. We then show how the idea is generalized to the relational model, where the strings can be longer than 3 and several eBWTs are needed to obtain worst-case optimality. The aim to minimize the amount of space in that case leads to consider novel eBWT variants, where columns other than the last can be chosen. Finally, we show how the same graph representation can be used to solve other typical queries, like finding graph paths that match regular expressions.

Cite as

Diego Arroyuelo and Gonzalo Navarro. BWT Indexes for Optimal Joins in Graph Databases. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 14:1-14:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{arroyuelo_et_al:OASIcs.Manzini.14,
  author =	{Arroyuelo, Diego and Navarro, Gonzalo},
  title =	{{BWT Indexes for Optimal Joins in Graph Databases}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{14:1--14:19},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.14},
  URN =		{urn:nbn:de:0030-drops-239222},
  doi =		{10.4230/OASIcs.Manzini.14},
  annote =	{Keywords: Graph databases, Ring index, extended BWT, compact data structures}
}
Document
Trie-Compressed Adaptive Set Intersection

Authors: Diego Arroyuelo and Juan Pablo Castillo

Published in: LIPIcs, Volume 259, 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)


Abstract
We introduce space- and time-efficient algorithms and data structures for the offline set intersection problem. We show that a sorted integer set S ⊆ [0..u) of n elements can be represented using compressed space while supporting k-way intersections in adaptive O(kδlg(u/δ)) time, δ being the alternation measure introduced by Barbay and Kenyon. Our experimental results suggest that our approaches are competitive in practice, outperforming the most efficient alternatives (Partitioned Elias-Fano indexes, Roaring Bitmaps, and Recursive Universe Partitioning (RUP)) in several scenarios, offering in general relevant space-time trade-offs.

Cite as

Diego Arroyuelo and Juan Pablo Castillo. Trie-Compressed Adaptive Set Intersection. In 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 259, pp. 1:1-1:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{arroyuelo_et_al:LIPIcs.CPM.2023.1,
  author =	{Arroyuelo, Diego and Castillo, Juan Pablo},
  title =	{{Trie-Compressed Adaptive Set Intersection}},
  booktitle =	{34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)},
  pages =	{1:1--1:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-276-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{259},
  editor =	{Bulteau, Laurent and Lipt\'{a}k, Zsuzsanna},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2023.1},
  URN =		{urn:nbn:de:0030-drops-179552},
  doi =		{10.4230/LIPIcs.CPM.2023.1},
  annote =	{Keywords: Set intersection problem, Adaptive Algorithms, Compressed and compact data structures}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail