DROPS

Volume

LIPIcs, Volume 75

16th International Symposium on Experimental Algorithms (SEA 2017)

SEA 2017, June 21-23, 2017, London, UK

Editors: Costas S. Iliopoulos, Solon P. Pissis, Simon J. Puglisi, and Rajeev Raman

Document

DOI: 10.4230/LIPIcs.ESA.2025.25

Instance-Optimal Imprecise Convex Hull

Authors: Sarita de Berg, Ivor van der Hoog, Eva Rotenberg, Daniel Rutschmann, and Sampson Wong

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

Imprecise measurements of a point set P = (p₁, …, p_n) can be modelled by a family of regions F = (R₁, …, R_n), where each imprecise region R_i ∈ F contains a unique point p_i ∈ P. A retrieval models an accurate measurement by replacing an imprecise region R_i with its corresponding point p_i. We construct the convex hull of an imprecise point set in the plane, by determining the cyclic ordering of the convex hull vertices of P as efficiently as possible. Efficiency is interpreted in two ways: (i) minimising the number of retrievals, and (ii) the computation time to determine the set of regions that must be retrieved. Previous works focused on only one of these two aspects: either minimising retrievals or optimising algorithmic runtime. Our contribution is the first to simultaneously achieve both. Let r(F, P) denote the minimal number of retrievals required by any algorithm to determine the convex hull of P for a given instance (F, P). For a family F of n constant-complexity polygons, our main result is a reconstruction algorithm that performs Θ(r(F, P)) retrievals in O(r(F, P) log³ n) time. Compared to previous approaches that achieve optimal retrieval counts, we improve the runtime per retrieval from polynomial to polylogarithmic. We extend the generality of previous results to simple k-gons, to pairwise disjoint disks with radii in [1,k], and to unit disks where at most k disks overlap in a single point. Our runtime scales linearly with k.

Cite as

Sarita de Berg, Ivor van der Hoog, Eva Rotenberg, Daniel Rutschmann, and Sampson Wong. Instance-Optimal Imprecise Convex Hull. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 25:1-25:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{deberg_et_al:LIPIcs.ESA.2025.25,
  author =	{de Berg, Sarita and van der Hoog, Ivor and Rotenberg, Eva and Rutschmann, Daniel and Wong, Sampson},
  title =	{{Instance-Optimal Imprecise Convex Hull}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{25:1--25:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.25},
  URN =		{urn:nbn:de:0030-drops-244932},
  doi =		{10.4230/LIPIcs.ESA.2025.25},
  annote =	{Keywords: convex hull, imprecise geometry preprocessing model, partial information}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.105

Parameterized Algorithms for Computing Pareto Sets

Authors: Joshua Marc Könen, Heiko Röglin, and Tarek Stuck

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

The problem of computing the set of Pareto-optimal solutions has been studied for a variety of multiobjective optimization problems. For many such problems, algorithms are known that compute the Pareto set in (weak) output-polynomial time. These algorithms are often based on dynamic programming and by weak output-polynomial time, we mean that the running time depends polynomially on the size of the Pareto set but also on the sizes of the Pareto sets of the subproblems that occur in the dynamic program. For some problems, like the multiobjective minimum spanning tree problem, such algorithms are not known to exist and for other problems, like multiobjective versions of many NP-hard problems, such algorithms cannot exist, unless 𝒫 = 𝒩𝒫. Dynamic programming over tree decompositions is a common technique in parameterized algorithms. In this paper, we study whether this technique can also be applied to compute Pareto sets of multiobjective optimization problems. We first derive an algorithm to compute the Pareto set for the multicriteria s-t cut problem and show how this result can be applied to a polygon aggregation problem arising in cartography that has recently been introduced by Rottmann et al. (GIScience 2021). We also show how to apply these techniques to also compute the Pareto set of the multiobjective minimum spanning tree problem and for the multiobjective TSP. The running time of our algorithms is O(f(w)⋅poly(n,p_{max})), where f is some function in the treewidth w, n is the input size, and p_{max} is an upper bound on the size of the Pareto sets of the subproblems that occur in the dynamic program. Finally, we present an experimental evaluation of computing Pareto sets on real-world instances of polygon aggregation problems. For this matter we devised a task-specific data structure that allows for efficient storage and modification of large sets of Pareto-optimal solutions. Throughout the implementation process, we incorporated several improved strategies and heuristics that significantly reduced both runtime and memory usage, enabling us to solve instances with treewidth of up to 22 within reasonable amount of time. Moreover, we conducted a preprocessing study to compare different tree decompositions in terms of their estimated overall runtime.

Cite as

Joshua Marc Könen, Heiko Röglin, and Tarek Stuck. Parameterized Algorithms for Computing Pareto Sets. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 105:1-105:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{konen_et_al:LIPIcs.ESA.2025.105,
  author =	{K\"{o}nen, Joshua Marc and R\"{o}glin, Heiko and Stuck, Tarek},
  title =	{{Parameterized Algorithms for Computing Pareto Sets}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{105:1--105:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.105},
  URN =		{urn:nbn:de:0030-drops-245749},
  doi =		{10.4230/LIPIcs.ESA.2025.105},
  annote =	{Keywords: parameterized algorithms, treewidth, multicriteria optimization problems, multicriteria MST, multicriteria TSP, polygon aggregation}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.82

Buffered Partially-Persistent External-Memory Search Trees

Authors: Gerth Stølting Brodal, Casper Moldrup Rysgaard, and Rolf Svenning

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

We present an optimal partially-persistent external-memory search tree with amortized I/O bounds matching those achieved by the non-persistent B^{ε}-tree by Brodal and Fagerberg [SODA 2003]. In a partially-persistent data structure, each update creates a new version. All past versions can be queried, but only the current version can be updated. Operations should be efficient with respect to the size N_v of the accessed version v. For any parameter 0 < ε < 1, our data structure supports insertions and deletions in amortized 𝒪(1/(ε B^{1 - ε}) log_B N_v) I/Os, where B is the external-memory block size. It also supports successor and range reporting queries in amortized 𝒪(1/ε log_B N_v + K/B) I/Os, where K is the number of keys reported. The space usage of the data structure is linear in the total number of updates. We make the standard and minimal assumption that the internal memory has size M ≥ 2B. The previous state-of-the-art external-memory partially-persistent search tree by Arge, Danner and Teh [JEA 2003] supports all operations in worst-case 𝒪(log_B N_v + K/B) I/Os, matching the bounds achieved by the classical B-tree by Bayer and McCreight [Acta Informatica 1972]. Our data structure successfully combines buffering updates with partial persistence. The I/O bounds can also be achieved in the worst-case sense, by slightly modifying our data structure and under the requirement that the memory size M = Ω(B^{1-ε} log₂(max_v N_v)). For updates, where the I/O bound is o(1), we assume that the I/Os are performed evenly spread out among the updates (by performing buffer-overflows incrementally). The worst-case result slightly improves the memory requirement over the previous ephemeral external-memory dictionary by Das, Iacono, and Nekrich (ISAAC 2022), who achieved matching worst-case I/O bounds but required M = Ω(B log_B N), where N is the size of the current dictionary.

Cite as

Gerth Stølting Brodal, Casper Moldrup Rysgaard, and Rolf Svenning. Buffered Partially-Persistent External-Memory Search Trees. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 82:1-82:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{brodal_et_al:LIPIcs.ESA.2025.82,
  author =	{Brodal, Gerth St{\o}lting and Rysgaard, Casper Moldrup and Svenning, Rolf},
  title =	{{Buffered Partially-Persistent External-Memory Search Trees}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{82:1--82:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.82},
  URN =		{urn:nbn:de:0030-drops-245507},
  doi =		{10.4230/LIPIcs.ESA.2025.82},
  annote =	{Keywords: B-tree, buffered updates, partial persistence, external memory}
}

Document

DOI: 10.4230/LIPIcs.WADS.2025.35

Succinct Data Structures for Chordal Graph with Bounded Leafage or Vertex Leafage

Authors: Meng He and Kaiyu Wu

Published in: LIPIcs, Volume 349, 19th International Symposium on Algorithms and Data Structures (WADS 2025)

Abstract

We improve the recent succinct data structure result of Balakrishnan et al. for chordal graphs with bounded vertex leafage (SWAT 2024). A chordal graph is a widely studied graph class which can be characterized as the intersection graph of subtrees of a host tree, denoted as a tree representation of the chordal graph. The vertex leafage and leafage parameters of a chordal graph deal with the existence of a tree representation with a bounded number of leaves in either the subtrees representing the vertices or the host tree itself. We simplify the lower bound proof of Balakrishnan et al. which applied to only chordal graphs with bounded vertex leafage, and extend it to a lower bound proof for chordal graphs with bounded leafage as well. For both classes of graphs, the information-theoretic lower bound we (re-)obtain for k = o(n) is (k-1)nlog n - knlog k - o(knlog n) bits, where the leafage or vertex leafage of the graph is at most k = o(n). We further extend the range of the parameter k to Θ(n) as well. Then we give a succinct data structure using (k-1)nlog (n/k) + o(knlog n) bits to answer adjacent queries, which test the adjacency between pairs of vertices, in O((log k)/(log log n) + 1) time compared to the O(klog n) time of the data structure of Balakrishnan et al. For the neighborhood query which lists the neighbours of a given vertex, our query time is O((log n)/(log log n)) per neighbour compared to O(k²log n) per neighbour. We also extend the data structure ideas to obtain a succinct data structure for chordal graphs with bounded leafage k, answering an open question of Balakrishnan et al. Our succinct data structure, which uses (k-1)nlog (n/k) + o(knlog n) bits, has query time O(1) for the adjacent query and O(1) per neighbour for the neighborhood query. Using slightly more space (an additional (1+ε)nlog n bits for any ε > 0) allows distance queries, which compute the number of edges in the shortest path between two given vertices, to be answered in O(1) time as well.

Cite as

Meng He and Kaiyu Wu. Succinct Data Structures for Chordal Graph with Bounded Leafage or Vertex Leafage. In 19th International Symposium on Algorithms and Data Structures (WADS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 349, pp. 35:1-35:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{he_et_al:LIPIcs.WADS.2025.35,
  author =	{He, Meng and Wu, Kaiyu},
  title =	{{Succinct Data Structures for Chordal Graph with Bounded Leafage or Vertex Leafage}},
  booktitle =	{19th International Symposium on Algorithms and Data Structures (WADS 2025)},
  pages =	{35:1--35:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-398-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{349},
  editor =	{Morin, Pat and Oh, Eunjin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WADS.2025.35},
  URN =		{urn:nbn:de:0030-drops-242660},
  doi =		{10.4230/LIPIcs.WADS.2025.35},
  annote =	{Keywords: Chordal Graph, Leafage, Vertex Leafage, Succinct Data Structure}
}

Document

DOI: 10.4230/LIPIcs.WABI.2025.17

An Efficient Data Structure and Algorithm for Long-Match Query in Run-Length Compressed BWT

Authors: Ahsan Sanaullah, Degui Zhi, and Shaojie Zhang

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)

Abstract

String matching problems in bioinformatics are typically for finding exact substring matches between a query and a reference text. Previous formulations often focus on maximum exact matches (MEMs). However, multiple occurrences of substrings of the query in the text that are long enough but not maximal may not be captured by MEMs. Such long matches can be informative, especially when the text is a collection of similar sequences such as genomes. In this paper, we describe a new type of match between a pattern and a text that aren't necessarily maximal in the query, but still contain useful matching information: locally maximal exact matches (LEMs). There are usually a large amount of LEMs, so we only consider those above some length threshold ℒ. These are referred to as long LEMs. The purpose of long LEMs is to capture substring matches between a query and a text that are not necessarily maximal in the pattern but still long enough to be important. Therefore efficient long LEMs finding algorithms are desired for these datasets. However, these datasets are too large to query on traditional string indexes. Fortunately, these datasets are very repetitive. Recently, compressed string indexes that take advantage of the redundancy in the data but retain efficient querying capability have been proposed as a solution. We therefore give an efficient algorithm for computing all the long LEMs of a query and a text in a BWT runs compressed string index. We describe an O(m+occ) expected time algorithm that relies on an O(r) words space string index for outputting all long LEMs of a pattern with respect to a text given the matching statistics of the pattern with respect to the text. Here m is the length of the query, occ is the number of long LEMs outputted, and r is the number of runs in the BWT of the text. The O(r) space string index we describe relies on an adaptation of the move data structure by Nishimoto and Tabei. We are able to support LCP[i] queries in constant time given SA[i]. In other words, we answer PLCP[i] queries in constant time. These PLCP queries enable the efficient long LEM query. Long LEMs may provide useful similarity information between a pattern and a text that MEMs may ignore. This information is particularly useful in pangenome and biobank scale haplotype panel contexts.

Cite as

Ahsan Sanaullah, Degui Zhi, and Shaojie Zhang. An Efficient Data Structure and Algorithm for Long-Match Query in Run-Length Compressed BWT. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 17:1-17:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{sanaullah_et_al:LIPIcs.WABI.2025.17,
  author =	{Sanaullah, Ahsan and Zhi, Degui and Zhang, Shaojie},
  title =	{{An Efficient Data Structure and Algorithm for Long-Match Query in Run-Length Compressed BWT}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{17:1--17:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.17},
  URN =		{urn:nbn:de:0030-drops-239433},
  doi =		{10.4230/LIPIcs.WABI.2025.17},
  annote =	{Keywords: BWT, LEM, Long LEM, MEM, Run Length Compressed BWT, Move Data Structure, Pangenome}
}

Document

Research

DOI: 10.4230/OASIcs.Grossi.8

Enumeration of Ordered Trees with Leaf Restrictions

Authors: Yasuaki Kobayashi, Dominik Köppl, Yasuko Matsui, Hirotaka Ono, Toshiki Saitoh, and Yushi Uno

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)

Abstract

An α-ary tree for a constant α ≥ 2 is a rooted tree in which each node has at most α children. A node having no children is called a leaf. For a given rooted tree and a node v, the number of edges from the root to v is called the depth of v. We call a vector w = (w_1,w_2,…, w_d) of nonnegative integers an (α-ary) distribution if there is an α-ary tree T such that the number of leaves at each depth i ∈ [1..d] in T is w_i. Although not every vector of nonnegative integers is a distribution, a distribution can be associated with many α-ary trees. In this paper, we present an algorithm to enumerate all α-ary trees for a given distribution. Our algorithm reports the first tree in O(d + ∑_{i = 1}^d w_i) time, and then each subsequent α-ary tree in O(max_{i = 1}^d w_i) time by representing each tree as the difference from the previous one. The algorithm can be restricted to computing all trees that are full, i.e., trees whose nodes have exactly α or no children.

Cite as

Yasuaki Kobayashi, Dominik Köppl, Yasuko Matsui, Hirotaka Ono, Toshiki Saitoh, and Yushi Uno. Enumeration of Ordered Trees with Leaf Restrictions. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 8:1-8:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{kobayashi_et_al:OASIcs.Grossi.8,
  author =	{Kobayashi, Yasuaki and K\"{o}ppl, Dominik and Matsui, Yasuko and Ono, Hirotaka and Saitoh, Toshiki and Uno, Yushi},
  title =	{{Enumeration of Ordered Trees with Leaf Restrictions}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{8:1--8:19},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.8},
  URN =		{urn:nbn:de:0030-drops-238077},
  doi =		{10.4230/OASIcs.Grossi.8},
  annote =	{Keywords: binary trees, ordered trees, rooted trees, enumeration algorithm, constant-time delay}
}

Document

Research

DOI: 10.4230/OASIcs.Grossi.12

Encoding Data Structures for Range Queries on Arrays

Authors: Seungbum Jo and Srinivasa Rao Satti

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)

Abstract

Efficiently processing range queries on arrays is a fundamental problem in computer science, with applications spanning diverse domains such as database management, computational biology, and geographic information systems. A range query retrieves information about a specific segment of an array, such as the sum, minimum, maximum, or median of elements within a given range. The challenge lies in designing data structures that allow such queries to be answered quickly, often in constant or logarithmic time, while keeping space overhead (and preprocessing time) small. Encoding data structures for range queries has emerged as a pivotal area of research due to the increasing demand for high-performance systems handling massive datasets. These structures consider the data together with the queries and aim to store only as much information about the data as is needed to answer the queries. The data structure does not need to access the original data to answer the queries. Encoding-based solutions often leverage techniques from succinct data structures, bit manipulation, and combinatorial optimization to achieve both space and time efficiency. By encoding the array in a manner that preserves critical information, these methods strike a balance between query time and space usage. In this survey article, we explore the landscape of encoding data structures for range queries on arrays, providing a comprehensive overview of some important results on space-efficient encodings for various types of range query.

Cite as

Seungbum Jo and Srinivasa Rao Satti. Encoding Data Structures for Range Queries on Arrays. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 12:1-12:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{jo_et_al:OASIcs.Grossi.12,
  author =	{Jo, Seungbum and Satti, Srinivasa Rao},
  title =	{{Encoding Data Structures for Range Queries on Arrays}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{12:1--12:12},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.12},
  URN =		{urn:nbn:de:0030-drops-238116},
  doi =		{10.4230/OASIcs.Grossi.12},
  annote =	{Keywords: range queries, RMQ, Cartesian tree, top-k queries, range median, range mode}
}

Document

Research

DOI: 10.4230/OASIcs.Grossi.15

Wavelet Tree, Part I: A Brief History

Authors: Paolo Ferragina, Raffaele Giancarlo, Giovanni Manzini, Giovanna Rosone, Rossano Venturini, and Jeffrey Scott Vitter

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)

Abstract

The Wavelet Tree data structure introduced in Grossi, Gupta, and Vitter [Grossi et al., 2003] is a space-efficient technique for rank and select queries that generalizes from binary symbols to an arbitrary multisymbol alphabet. Over the last two decades, it has become a pivotal tool in modern full-text indexing and data compression because of its properties and capabilities in compressing and indexing data, with many applications to information retrieval, genome analysis, data mining, and web search. In this paper, we survey the fascinating history and impact of Wavelet Trees; no doubt many more developments are yet to come. Our survey borrows some content from the authors' earlier works. This paper is divided into two parts: one (this one) giving a brief history of Wavelet Trees, including its varieties and practical implementations, dedicated to this Festschrift’s honoree Roberto Grossi; the second part deals with Wavelet Tree-based text indexing and is included in the Festschrift dedicated to Giovanni Manzini [Ferragina et al., 2025].

Cite as

Paolo Ferragina, Raffaele Giancarlo, Giovanni Manzini, Giovanna Rosone, Rossano Venturini, and Jeffrey Scott Vitter. Wavelet Tree, Part I: A Brief History. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 15:1-15:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{ferragina_et_al:OASIcs.Grossi.15,
  author =	{Ferragina, Paolo and Giancarlo, Raffaele and Manzini, Giovanni and Rosone, Giovanna and Venturini, Rossano and Vitter, Jeffrey Scott},
  title =	{{Wavelet Tree, Part I: A Brief History}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{15:1--15:11},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.15},
  URN =		{urn:nbn:de:0030-drops-238143},
  doi =		{10.4230/OASIcs.Grossi.15},
  annote =	{Keywords: Wavelet tree, data compression, text indexing, compressed suffix array, Burrows-Wheeler transform, rank and select}
}

@InProceedings{ferragina_et_al:OASIcs.Grossi.15,
  author =	{Ferragina, Paolo and Giancarlo, Raffaele and Manzini, Giovanni and Rosone, Giovanna and Venturini, Rossano and Vitter, Jeffrey Scott},
  title =	{{Wavelet Tree, Part I: A Brief History}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{15:1--15:11},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.15},
  URN =		{urn:nbn:de:0030-drops-238143},
  doi =		{10.4230/OASIcs.Grossi.15},
  annote =	{Keywords: Wavelet tree, data compression, text indexing, compressed suffix array, Burrows-Wheeler transform, rank and select}
}

Document

Research

DOI: 10.4230/OASIcs.Grossi.16

Faster Range LCP Queries in Linear Space

Authors: Yakov Nekirch and Sharma V. Thankachan

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)

Abstract

A range LCP query rlcp(α,β) on a text T[1 .. n] asks to return the length of the longest common prefix of any two suffixes of T with starting positions in a range [α,β]. In this paper we describe a data structure that uses O(n) space and supports range LCP queries in time O(log^ε n) for any constant ε > 0. Our result is the fastest currently known linear-space solution for this problem.

Cite as

Yakov Nekirch and Sharma V. Thankachan. Faster Range LCP Queries in Linear Space. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 16:1-16:6, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{nekirch_et_al:OASIcs.Grossi.16,
  author =	{Nekirch, Yakov and Thankachan, Sharma V.},
  title =	{{Faster Range LCP Queries in Linear Space}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{16:1--16:6},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.16},
  URN =		{urn:nbn:de:0030-drops-238158},
  doi =		{10.4230/OASIcs.Grossi.16},
  annote =	{Keywords: Data Structures, String Algorithms, Longest Common Prefix}
}

Document

DOI: 10.4230/OASIcs.Manzini.11

Circular Dictionary Matching Using Extended BWT

Authors: Wing-Kai Hon, Rahul Shah, and Sharma V. Thankachan

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)

Abstract

The dictionary matching problem involves preprocessing a set of strings (patterns) into a data structure that efficiently identifies all occurrences of these patterns within a query string (text). In this work, we investigate a variation of this problem, termed circular dictionary matching, where the patterns are circular, meaning their cyclic shifts are also considered valid patterns. Such patterns naturally occur in areas such as bioinformatics and computational geometry. Based on the extended Burrows-Wheeler Transformation (eBWT), we design a space-efficient solution for this problem. Specifically, we show that a dictionary of d circular patterns of total length n can be indexed in nlog σ + O(n+dlog n+σ log n) bits of space and support circular dictionary matching on a query text T in O((|T|+occ)log n) time, where σ represents the size of the underlying alphabet and occ represents the output size.

Cite as

Wing-Kai Hon, Rahul Shah, and Sharma V. Thankachan. Circular Dictionary Matching Using Extended BWT. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 11:1-11:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{hon_et_al:OASIcs.Manzini.11,
  author =	{Hon, Wing-Kai and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{Circular Dictionary Matching Using Extended BWT}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{11:1--11:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.11},
  URN =		{urn:nbn:de:0030-drops-239195},
  doi =		{10.4230/OASIcs.Manzini.11},
  annote =	{Keywords: String algorithms, Burrows-Wheeler transformation, suffix trees, succinct data structures}
}

Document

DOI: 10.4230/OASIcs.Manzini.5

FM-Adaptive: A Practical Data-Aware FM-Index

Authors: Hongwei Huo, Zongtao He, Pengfei Liu, and Jeffrey Scott Vitter

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)

Abstract

The FM-index provides an important solution for efficient retrieval and search in textual big data. Its variants have been widely used in many fields including information retrieval, genome analysis, and web searching. In this paper, we propose improvements via a new compressed representation of the wavelet tree of the Burrows-Wheeler transform of the input text, which incorporates the gap γ-encoding. Our theoretical analysis shows that the new index, called FM-Adaptive, achieves asymptotic space optimality within a factor of 2 in the leading term, but it has a better compression and faster retrieval in practice than the competitive optimal compression boosting used in previous FM-indexes. We present a practical improved locate algorithm that provides substantially faster locating time based upon memoization, which takes advantage of the overlapping subproblems property. We design the lookup table for accelerated decoding to support fast pattern matching in a text. Extensive experiments demonstrate that FM-Adaptive provides faster query performance, often by a considerable amount, and/or comparable or better compression than other state-of-the-art FM-index methods.

Cite as

Hongwei Huo, Zongtao He, Pengfei Liu, and Jeffrey Scott Vitter. FM-Adaptive: A Practical Data-Aware FM-Index. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 5:1-5:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{huo_et_al:OASIcs.Manzini.5,
  author =	{Huo, Hongwei and He, Zongtao and Liu, Pengfei and Vitter, Jeffrey Scott},
  title =	{{FM-Adaptive: A Practical Data-Aware FM-Index}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{5:1--5:23},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.5},
  URN =		{urn:nbn:de:0030-drops-239139},
  doi =		{10.4230/OASIcs.Manzini.5},
  annote =	{Keywords: Text indexing, Burrows-Wheeler transform, Compressed wavelet trees, Entropy-compressed, Compressed data structures}
}

Document

DOI: 10.4230/OASIcs.Manzini.4

Wavelet Tree, Part II: Text Indexing

Authors: Paolo Ferragina, Raffaele Giancarlo, Roberto Grossi, Giovanna Rosone, Rossano Venturini, and Jeffrey Scott Vitter

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)

Abstract

The Wavelet Tree data structure introduced in Grossi, Gupta, and Vitter [Grossi et al., 2003] is a space-efficient technique for rank and select queries that generalizes from binary symbols to an arbitrary multisymbol alphabet. Over the last two decades, it has become a pivotal tool in modern full-text indexing and data compression because of its properties and capabilities in compressing and indexing data, with many applications to information retrieval, genome analysis, data mining, and web search. In this paper, we survey the fascinating history and impact of Wavelet Trees; no doubt many more developments are yet to come. Our survey borrows some content from the authors' earlier works. This paper is divided into two parts: The first part gives a brief history of Wavelet Trees, including its varieties and practical implementations, which appears in the Festschrift dedicated to Roberto Grossi; the second part (this one) deals with Wavelet Tree-based text indexing and is included in the Festschrift dedicated to Giovanni Manzini.

Cite as

Paolo Ferragina, Raffaele Giancarlo, Roberto Grossi, Giovanna Rosone, Rossano Venturini, and Jeffrey Scott Vitter. Wavelet Tree, Part II: Text Indexing. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 4:1-4:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{ferragina_et_al:OASIcs.Manzini.4,
  author =	{Ferragina, Paolo and Giancarlo, Raffaele and Grossi, Roberto and Rosone, Giovanna and Venturini, Rossano and Vitter, Jeffrey Scott},
  title =	{{Wavelet Tree, Part II: Text Indexing}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{4:1--4:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.4},
  URN =		{urn:nbn:de:0030-drops-239127},
  doi =		{10.4230/OASIcs.Manzini.4},
  annote =	{Keywords: Wavelet tree, data compression, text indexing, compressed suffix array, Burrows-Wheeler transform, rank and select}
}

Document

DOI: 10.4230/LIPIcs.SEA.2025.15

Succinct Rank Dictionaries Revisited

Authors: Saska Dönges and Simon J. Puglisi

Published in: LIPIcs, Volume 338, 23rd International Symposium on Experimental Algorithms (SEA 2025)

Abstract

We study data structures for representing sets of m elements drawn from the universe [0..n-1] that support access and rank queries. A classical approach to this problem, foundational to the fields of succinct and compact data structures, is to represent the set as a bitvector X of n bits, where X[i] = 1 iff i is a member of the set. Our particular focus in this paper is on structures taking log₂{n choose m} + o(n) bits, which stem from the so-called RRR bitvector scheme (Raman et al., ACM Trans. Alg., 2007). In RRR bitvectors, X is conceptually divided into n/b blocks of b bits each. A block containing c 1 bits is then encoded using log₂ b + log₂{b choose c} bits, where log b bits are used to encode c, and log₂{b choose c} bits are used to say which of the {b choose c} possible combinations the block represents. In all existing RRR implementations the code assigned to a block is its lexicographical rank amongst the {b choose c} combinations of its class. In this paper we explore alternative non-lexicographical assignments of codes to blocks. We show these approaches can lead to faster query times and offer relevant space-time trade-offs in practice compared to state-of-the-art implementations (Gog and Petri, Software, Prac. & Exp., 2014) from the Succinct Data Structures Library.

Cite as

Saska Dönges and Simon J. Puglisi. Succinct Rank Dictionaries Revisited. In 23rd International Symposium on Experimental Algorithms (SEA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 338, pp. 15:1-15:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{donges_et_al:LIPIcs.SEA.2025.15,
  author =	{D\"{o}nges, Saska and Puglisi, Simon J.},
  title =	{{Succinct Rank Dictionaries Revisited}},
  booktitle =	{23rd International Symposium on Experimental Algorithms (SEA 2025)},
  pages =	{15:1--15:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-375-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{338},
  editor =	{Mutzel, Petra and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.15},
  URN =		{urn:nbn:de:0030-drops-232530},
  doi =		{10.4230/LIPIcs.SEA.2025.15},
  annote =	{Keywords: data structures, data compression, succinct data structures, compressed data structures, weighted de Bruijn sequence, text indexing, string algorithms}
}

Document

DOI: 10.4230/LIPIcs.SEA.2025.8

A Simple Integer Successor-Delete Data Structure

Authors: Gerth Stølting Brodal

Published in: LIPIcs, Volume 338, 23rd International Symposium on Experimental Algorithms (SEA 2025)

Abstract

We consider a simple decremental data structure for maintaining a set of integers, that supports initializing the set to {1,2,…,n} followed by d deletions and s successor queries in arbitrary order in total 𝒪(n+d+s⋅(1+log_{max(2,s/n)} min(s,n))) time. The data structure consists of a single array of n integers. A straightforward modification allows the data structure to also support p predecessor and r range queries, with a total output k, in total 𝒪(n+d+k+q ⋅ (1+log_{max(2,q/n)} min(q,n))) time, where q = s+p+r. The data structure is essentially a special case of the classic union-find data structure with path compression but with unweighted linking (i.e., without linking by rank or size), that is known to achieve logarithmic amortized time bounds (Tarjan and van Leeuwen, 1984). In this paper we study the efficiency of this simple data structure, and compare it to other, theoretically superior, data structures.

Cite as

Gerth Stølting Brodal. A Simple Integer Successor-Delete Data Structure. In 23rd International Symposium on Experimental Algorithms (SEA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 338, pp. 8:1-8:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{brodal:LIPIcs.SEA.2025.8,
  author =	{Brodal, Gerth St{\o}lting},
  title =	{{A Simple Integer Successor-Delete Data Structure}},
  booktitle =	{23rd International Symposium on Experimental Algorithms (SEA 2025)},
  pages =	{8:1--8:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-375-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{338},
  editor =	{Mutzel, Petra and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.8},
  URN =		{urn:nbn:de:0030-drops-232461},
  doi =		{10.4230/LIPIcs.SEA.2025.8},
  annote =	{Keywords: Successor queries, deletions, interval union-find, union-find}
}

68 Search Results for "Raman, Rajeev"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message