Saska Dönges, Simon J. Puglisi
Creative Commons Attribution 4.0 International license
We study data structures for representing sets of m elements drawn from the universe [0..n-1] that support access and rank queries. A classical approach to this problem, foundational to the fields of succinct and compact data structures, is to represent the set as a bitvector X of n bits, where X[i] = 1 iff i is a member of the set. Our particular focus in this paper is on structures taking log₂{n choose m} + o(n) bits, which stem from the so-called RRR bitvector scheme (Raman et al., ACM Trans. Alg., 2007). In RRR bitvectors, X is conceptually divided into n/b blocks of b bits each. A block containing c 1-bits is then encoded using log₂ b + log₂{b choose c} bits, where log₂ b bits are used to encode c, and log₂{b choose c} bits are used to say which of the {b choose c} possible combinations the block represents. In all existing RRR implementations, the code assigned to a block is its lexicographical rank amongst the {b choose c} combinations of its class. In this paper we explore alternative non-lexicographical assignments of codes to blocks. We show these approaches can lead to faster query times and offer relevant space-time trade-offs in practice compared to state-of-the-art implementations (Gog and Petri, Software, Prac. & Exp., 2014) from the Succinct Data Structures Library.
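As a rough illustration of the lexicographical block coding described in the abstract, the following Python sketch maps a b-bit block to a pair (c, offset), where c is the number of 1-bits and offset is the block's lexicographical rank among all b-bit blocks with exactly c ones. The function names `encode_block` and `decode_block` are hypothetical and chosen only for this example. This is not the paper's implementation or the SDSL one, and it shows only the baseline lexicographical assignment, not the alternative assignments the paper explores. It assumes Python 3.8+ for `math.comb`.

```python
from math import comb


def encode_block(bits):
    """Return (c, offset) for a block given as a list of 0/1 values.

    c is the number of 1-bits (the block's class); offset is the block's
    lexicographical rank among all blocks of the same length and class,
    with 0 ordered before 1.
    """
    b = len(bits)
    c = sum(bits)
    offset = 0
    ones_left = c
    for i, bit in enumerate(bits):
        if bit:
            # Blocks sharing this prefix but with a 0 at position i come
            # earlier; they place ones_left ones in the remaining positions.
            offset += comb(b - i - 1, ones_left)
            ones_left -= 1
    return c, offset


def decode_block(b, c, offset):
    """Invert encode_block: rebuild the b-bit block from (c, offset)."""
    bits = []
    ones_left = c
    for i in range(b):
        # Number of same-class blocks that have a 0 at position i.
        zero_first = comb(b - i - 1, ones_left)
        if ones_left > 0 and offset >= zero_first:
            bits.append(1)
            offset -= zero_first
            ones_left -= 1
        else:
            bits.append(0)
    return bits
```

For a block of b bits, the offset always lies in the range [0, {b choose c}), so it fits in about log₂{b choose c} bits, which matches the space accounting given in the abstract.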
@InProceedings{donges_et_al:LIPIcs.SEA.2025.15,
author = {D\"{o}nges, Saska and Puglisi, Simon J.},
title = {{Succinct Rank Dictionaries Revisited}},
booktitle = {23rd International Symposium on Experimental Algorithms (SEA 2025)},
pages = {15:1--15:18},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-375-1},
ISSN = {1868-8969},
year = {2025},
volume = {338},
editor = {Mutzel, Petra and Prezza, Nicola},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.15},
URN = {urn:nbn:de:0030-drops-232530},
doi = {10.4230/LIPIcs.SEA.2025.15},
annote = {Keywords: data structures, data compression, succinct data structures, compressed data structures, weighted de Bruijn sequence, text indexing, string algorithms}
}