Creative Commons Attribution 4.0 International license
Motivation. Given a text, a query rank(q, c) counts the number of occurrences of character c among the first q characters of the text. Space-efficient methods to answer these rank queries form an important building block in many succinct data structures. For example, the FM-index [Ferragina and Manzini, 2000] is a widely used data structure that uses rank queries to locate all occurrences of a pattern in a text. In bioinformatics applications, the goal is usually to process large inputs as fast as possible. Thus, data structures should have high throughput when used with many threads.

Contributions. We first survey existing results on rank data structures. For the binary alphabet (σ = 2), we then develop BiRank, which has 3.28% space overhead. BiRank merges the central ideas of two recent papers: (1) we interleave (inline) offsets in each cache line of the underlying bit vector [Laws et al., 2024], reducing cache misses, and (2) these offsets are to the middle of each block, so that only half of each block needs popcounting [Gottlieb and Reinert, 2025]. In QuadRank (14.4% overhead), we extend these techniques to the DNA alphabet (σ = 4). Both data structures typically require only a single cache miss per query, making them highly suitable for high-throughput and memory-bound settings. To enable efficient batch processing, we support prefetching the cache lines required to answer upcoming queries.

Results. BiRank and QuadRank are around 1.5× and 2× faster than similar-overhead methods that do not use interleaving. Prefetching gives an additional 2× speedup, at which point the dual-channel DDR4 RAM bandwidth becomes a hard limit on the total throughput. With prefetching, both methods outperform all other methods apart from SPIDER [Laws et al., 2024] by 2×. When using QuadRank with prefetching in a toy count-only FM-index, QuadFm, this results in a smaller size and up to 4× speedup over Genedex, a state-of-the-art batching FM-index implementation.

Conclusion. Optimizing data structures for high throughput, by minimizing cache misses and branch misses and adding support for prefetching, can result in significant speedups when benchmarks are adjusted accordingly.
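
The rank definition and the mid-block offset idea from the abstract can be illustrated with a minimal Python sketch for the binary alphabet. This is not the paper's implementation: the names `naive_rank`, `MidOffsetRank`, and `block_bits` are hypothetical, the block size of 8 bits is chosen only for readability, and the offsets are kept in a separate list, whereas the abstract describes storing them inline in the same cache line as the bits.

```python
def naive_rank(bits, q):
    """Number of 1-bits among the first q bits (the definition of rank)."""
    return sum(bits[:q])


class MidOffsetRank:
    """Illustrative binary rank with one offset per block, taken at the block's middle.

    Assumption: blocks are tiny (8 bits) and offsets live in a separate list.
    A cache-aware layout would store each offset next to its block's bits.
    """

    def __init__(self, bits, block_bits=8):
        assert block_bits % 2 == 0
        self.n = len(bits)
        self.block_bits = block_bits
        self.half = block_bits // 2
        # One extra block so that a query with q == n still maps to a valid block.
        num_blocks = self.n // block_bits + 1
        self.bits = list(bits) + [0] * (num_blocks * block_bits - self.n)
        self.mid_offsets = []
        ones = 0
        for b in range(num_blocks):
            start = b * block_bits
            mid = start + self.half
            ones += sum(self.bits[start:mid])
            self.mid_offsets.append(ones)  # number of 1-bits before the middle of block b
            ones += sum(self.bits[mid:start + block_bits])

    def rank(self, q):
        """Number of 1-bits among the first q bits, counting at most half a block."""
        assert 0 <= q <= self.n
        b = q // self.block_bits
        mid = b * self.block_bits + self.half
        if q >= mid:
            # Count forward from the middle of the block.
            return self.mid_offsets[b] + sum(self.bits[mid:q])
        # Count backward: subtract the 1-bits between q and the middle.
        return self.mid_offsets[b] - sum(self.bits[q:mid])


bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
index = MidOffsetRank(bits)
```

Because the stored offset refers to the middle of the block, a query position is never more than half a block away from it, so `rank` inspects at most `block_bits // 2` bits. For every position q in this small input, `index.rank(q)` agrees with `naive_rank(bits, q)`.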
@InProceedings{grootkoerkamp:LIPIcs.SEA.2026.20,
author = {Groot Koerkamp, Ragnar},
title = {{QuadRank: Engineering a High Throughput Rank}},
booktitle = {24th International Symposium on Experimental Algorithms (SEA 2026)},
pages = {20:1--20:23},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-422-2},
ISSN = {1868-8969},
year = {2026},
volume = {371},
editor = {Aum\"{u}ller, Martin and Finocchi, Irene},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2026.20},
URN = {urn:nbn:de:0030-drops-260248},
doi = {10.4230/LIPIcs.SEA.2026.20},
annote = {Keywords: Rank, Succinct Data Structures, Cache Performance, Prefetching}
}