eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2021-06-30
5:1
5:20
10.4230/LIPIcs.CPM.2021.5
article
The k-Mappability Problem Revisited
Amir, Amihood
1
2
Boneh, Itai
1
Kondratovsky, Eitan
3
4
Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
Georgia Tech, Atlanta, GA, USA
Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
Cheriton School of Computer Science, Waterloo University, Waterloo, Canada
The k-mappability problem has two integer parameters, m and k. For every subword of length m in a text S, we wish to report the number of indices in S at which the word occurs with at most k mismatches.
The problem was recently tackled by Alzamel et al. [Mai Alzamel et al., 2018]. For a text over a constant-size alphabet Σ and k ∈ O(1), they present an algorithm with linear space and O(n log^{k+1} n) time. For the case of k = 1 and a constant-size alphabet, a faster algorithm with linear space and O(n log(n) log log(n)) time was presented in [Mai Alzamel et al., 2020].
In this work, we enhance the techniques of [Mai Alzamel et al., 2020] to obtain an algorithm with linear space and O(n log(n)) time for k = 1. Our algorithm also removes the constraint that the alphabet be of constant size. In addition, we present linear-time algorithms for the case of k = 1, |Σ| ∈ O(1) and m = Ω(√n).
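To make the problem statement concrete, the following is a minimal brute-force sketch of k-mappability in Python. It is not the algorithm of this paper or of Alzamel et al.; it only illustrates the definition and runs in O(n² m) time. The function names and the `include_self` flag are hypothetical, and counting each subword's own starting index as an occurrence is an assumption about how the definition is read.

```python
def hamming_at_most(a, b, k):
    """Return True if equal-length strings a and b differ in at most k positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:
                return False  # stop early once the budget is exceeded
    return True


def naive_k_mappability(s, m, k, include_self=True):
    """For each start index i, count the start indices j such that
    s[j:j+m] is within Hamming distance k of s[i:i+m].

    include_self (an assumption, not taken from the abstract) controls
    whether the trivial occurrence j == i is counted.
    """
    starts = range(len(s) - m + 1)  # empty when m > len(s)
    result = []
    for i in starts:
        count = 0
        for j in starts:
            if j == i and not include_self:
                continue
            if hamming_at_most(s[i:i + m], s[j:j + m], k):
                count += 1
        result.append(count)
    return result


if __name__ == "__main__":
    # Length-2 subwords of "aabaa" are "aa", "ab", "ba", "aa".
    print(naive_k_mappability("aabaa", 2, 1))
```

A quadratic baseline of this kind is mainly useful for checking the output of a faster implementation on small inputs.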
https://drops.dagstuhl.de/storage/00lipics/lipics-vol191-cpm2021/LIPIcs.CPM.2021.5/LIPIcs.CPM.2021.5.pdf
Pattern Matching
Hamming Distance
Suffix Tree
Suffix Array