Lukas Nalbach
Creative Commons Attribution 4.0 International license
We present a static text index called Move-r, a highly optimized version of the r-index (Gagie et al., 2020) that incorporates recent theoretical developments of the move data structure (Nishimoto and Tabei, 2021). The r-index is the method of choice for indexing highly repetitive texts, such as different versions of a text document or DNA from the same species, because it exploits the compressibility of the underlying data. With Move-r, we can answer count and locate queries 2-35 (typically 15) times as fast as with any other r-index supporting locate queries, while the index is 0.8-2.5 (typically 2) times as large. A Move-r index can be constructed 0.9-2 (typically 2) times as fast while using 1/3-1 (typically 1/2) times as much space.
@InProceedings{bertram_et_al:LIPIcs.SEA.2024.1,
author = {Bertram, Nico and Fischer, Johannes and Nalbach, Lukas},
title = {{Move-r: Optimizing the r-index}},
booktitle = {22nd International Symposium on Experimental Algorithms (SEA 2024)},
pages = {1:1--1:19},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-325-6},
ISSN = {1868-8969},
year = {2024},
volume = {301},
editor = {Liberti, Leo},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.1},
URN = {urn:nbn:de:0030-drops-203662},
doi = {10.4230/LIPIcs.SEA.2024.1},
annote = {Keywords: Compressed Text Index, Burrows-Wheeler Transform}
}