Clifford, Raphael ; Efremenko, Klim ; Porat, Ely ; Rothschild, Amir

Pattern matching with don't cares and few errors

09281.PoratEly.Paper.2244.pdf (0.3 MB)


We present solutions for the k-mismatch pattern matching problem with don't cares. Given a text t of length n, a pattern p of length m with don't care symbols, and a bound k, our algorithms find all the places where the pattern matches the text with at most k mismatches. We first give a \Theta(n(k + log m log k) log n) time randomised algorithm which finds the correct answer with high probability. We then present a new deterministic \Theta(nk^2 log^2 m) time solution that uses tools originally developed for group testing. Taking our derandomisation approach further, we develop an approach based on k-selectors that runs in \Theta(nk polylog m) time. Further, in each case the location of the mismatches at each alignment is also given at no extra cost.
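
To make the problem statement concrete, the following is a minimal sketch of a naive O(nm) baseline. It is not any of the algorithms from the paper. The function name, the "?" wildcard character, and the dictionary return format are assumptions chosen for illustration only.

```python
def k_mismatch_with_dont_cares(text, pattern, k, wildcard="?"):
    """Naive O(nm) baseline (illustrative only, not the paper's method).

    Returns a dict mapping each alignment i, where the pattern matches
    text[i:i+m] with at most k mismatches, to the list of mismatching
    pattern positions. The wildcard character is an assumed convention.
    """
    n, m = len(text), len(pattern)
    results = {}
    for i in range(n - m + 1):
        mismatches = []
        for j in range(m):
            # A don't care symbol in the pattern matches any text character.
            if pattern[j] != wildcard and pattern[j] != text[i + j]:
                mismatches.append(j)
                if len(mismatches) > k:
                    break  # too many mismatches at this alignment
        else:
            # Loop finished without exceeding k mismatches.
            results[i] = mismatches
    return results
```

For example, with text "abcabd", pattern "a?d" and k = 1, the sketch reports alignment 0 with a mismatch at pattern position 2 and alignment 3 with no mismatches.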

Seminar: 09281 - Search Methodologies
Issue Date: 2009
Date of publication: 10.11.2009

