Igor Tatarnikov,
Ardavan Shahrabi Farahani,
Sana Kashgouli,
Travis Gagie
Creative Commons Attribution 4.0 International license
Suppose we are asked to index a text T[0..n-1] such that, given a pattern P[0..m-1], we can quickly report the maximal substrings of P that each occur in T at least k times. We first show how we can add O(r log n) bits to Rossi et al.’s recent MONI index, where r is the number of runs in the Burrows-Wheeler Transform of T, such that it supports such queries in O(k m log n) time. We then show how, if we are given k at construction time, we can reduce the query time to O(m log n).
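To make the query in the abstract concrete, the following is a minimal brute-force sketch of what "the maximal substrings of P that each occur in T at least k times" means. It is not the MONI-based method of the paper and uses no index: it simply rescans T for each candidate substring. The function names `count_occurrences` and `k_mems` are hypothetical, and overlapping occurrences are assumed to count separately.

```python
def count_occurrences(text: str, s: str) -> int:
    """Count occurrences of s in text, allowing overlaps."""
    count = 0
    pos = text.find(s)
    while pos != -1:
        count += 1
        pos = text.find(s, pos + 1)  # advance by one so overlaps are counted
    return count


def k_mems(text: str, pattern: str, k: int) -> list[tuple[int, int]]:
    """Return (start, length) pairs of the maximal substrings of pattern
    that occur at least k times in text (brute force, illustration only)."""
    result = []
    prev_end = 0  # largest end position reached from any earlier start
    for i in range(len(pattern)):
        # Extend to the right while the substring still occurs >= k times.
        length = 0
        while (i + length < len(pattern)
               and count_occurrences(text, pattern[i:i + length + 1]) >= k):
            length += 1
        end = i + length
        # The match is also left-maximal only if it ends strictly later
        # than every match that started earlier.
        if length > 0 and end > prev_end:
            result.append((i, length))
        prev_end = max(prev_end, end)
    return result


print(k_mems("banana", "nanab", 2))  # [(0, 2), (1, 3)], i.e. "na" and "ana"
```

With k = 1 the same call reports the ordinary maximal exact matches of the pattern. The sketch takes time polynomial in n and m, which is the cost the O(k m log n) and O(m log n) bounds in the abstract avoid.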
@InProceedings{tatarnikov_et_al:LIPIcs.CPM.2023.26,
author = {Tatarnikov, Igor and Shahrabi Farahani, Ardavan and Kashgouli, Sana and Gagie, Travis},
title = {{MONI Can Find k-MEMs}},
booktitle = {34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)},
pages = {26:1--26:14},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-276-1},
ISSN = {1868-8969},
year = {2023},
volume = {259},
editor = {Bulteau, Laurent and Lipt\'{a}k, Zsuzsanna},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2023.26},
URN = {urn:nbn:de:0030-drops-179802},
doi = {10.4230/LIPIcs.CPM.2023.26},
annote = {Keywords: Compact data structures, Burrows-Wheeler Transform, run-length compression, maximal exact matches}
}