Creative Commons Attribution 4.0 International license
We present solutions for the k-mismatch pattern matching problem with don't cares. Given a text t of length n, a pattern p of length m with don't care symbols, and a bound k, our algorithms find all the positions where the pattern matches the text with at most k mismatches. We first give a \Theta(n(k + log m log k) log n) time randomised algorithm which finds the correct answer with high probability. We then present a new deterministic \Theta(nk^2 log^2 m) time solution that uses tools originally developed for group testing. Taking our derandomisation approach further, we develop an approach based on k-selectors that runs in \Theta(nk polylog m) time. Further, in each case the location of the mismatches at each alignment is also given at no extra cost.
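To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not one of the algorithms from the paper: it runs in O(nm) time and only illustrates what is being computed, namely the alignments with at most k mismatches together with the mismatch locations at each alignment. The function name, the choice of "?" as the don't care symbol, and the decision to accept don't cares in either string are assumptions made for illustration.

```python
from typing import Dict, List

WILDCARD = "?"  # assumed don't care symbol; the source does not fix one


def k_mismatch_with_dont_cares(
    text: str, pattern: str, k: int, wildcard: str = WILDCARD
) -> Dict[int, List[int]]:
    """Naive O(nm) baseline (hypothetical helper, not the paper's method).

    Returns a dict mapping each alignment i with at most k mismatches
    to the list of pattern offsets j where text[i + j] != pattern[j].
    A don't care symbol on either side never counts as a mismatch.
    """
    n, m = len(text), len(pattern)
    result: Dict[int, List[int]] = {}
    for i in range(n - m + 1):
        mismatches: List[int] = []
        for j in range(m):
            a, b = text[i + j], pattern[j]
            if a != wildcard and b != wildcard and a != b:
                mismatches.append(j)
                if len(mismatches) > k:
                    break  # this alignment already exceeds the bound
        if len(mismatches) <= k:
            result[i] = mismatches
    return result
```

For example, with text "abcabd", pattern "a?d" and k = 1, alignment 0 is reported with a single mismatch at offset 2 and alignment 3 is reported as an exact match.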
@InProceedings{clifford_et_al:DagSemProc.09281.5,
author = {Clifford, Raphael and Efremenko, Klim and Porat, Ely and Rothschild, Amir},
title = {{Pattern matching with don't cares and few errors}},
booktitle = {Search Methodologies},
pages = {1--19},
series = {Dagstuhl Seminar Proceedings (DagSemProc)},
ISSN = {1862-4405},
year = {2009},
volume = {9281},
editor = {Rudolf Ahlswede and Ferdinando Cicalese and Ugo Vaccaro},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.09281.5},
URN = {urn:nbn:de:0030-drops-22442},
doi = {10.4230/DagSemProc.09281.5},
annote = {Keywords: Prime Numbers, Group Testing, Streaming, Pattern Matching}
}