eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2019-06-06
21:1
21:15
10.4230/LIPIcs.CPM.2019.21
article
Streaming Dictionary Matching with Mismatches
Gawrychowski, Paweł
1
Starikovskaya, Tatiana
2
University of Wrocław, 50-137 Wrocław, Poland
DIENS, École normale supérieure, PSL Research University, 75005 Paris, France
In the k-mismatch problem we are given a pattern of length m and a text and must find all locations where the Hamming distance between the pattern and the text is at most k. A series of recent breakthroughs has resulted in an ultra-efficient streaming algorithm for this problem that requires only O(k log(m/k)) space [Clifford, Kociumaka, Porat, SODA 2019]. In this work, we consider a strictly harder problem called dictionary matching with k mismatches, where we are given a dictionary of d patterns of lengths at most m and must find all their k-mismatch occurrences in the text, and show the first streaming algorithm for it. The algorithm uses O(k d log^k d polylog m) space and processes each position of the text in O(k log^k d polylog m + occ) time, where occ is the number of k-mismatch occurrences of the patterns that end at this position. The algorithm is randomised and outputs correct answers with high probability.
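To make the problem statement concrete, the following is a minimal brute-force sketch of dictionary matching with k mismatches. It is not the streaming algorithm of the paper: it keeps the whole text in memory, is deterministic, and takes time proportional to the total pattern length per text position. The function names `hamming_at_most` and `dictionary_k_mismatch` are illustrative assumptions, not taken from the paper.

```python
def hamming_at_most(a, b, k):
    """Return True if equal-length strings a and b differ in at most k positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:  # stop early once the budget is exceeded
                return False
    return True


def dictionary_k_mismatch(text, patterns, k):
    """Yield (end_position, pattern_index) for every k-mismatch occurrence.

    Occurrences are reported by the text position at which they end,
    mirroring how the abstract counts `occ` per position.
    Naive baseline only: no sketching, no space bound.
    """
    for end in range(len(text)):
        for idx, pattern in enumerate(patterns):
            start = end - len(pattern) + 1
            if start >= 0 and hamming_at_most(text[start:end + 1], pattern, k):
                yield end, idx


if __name__ == "__main__":
    for end, idx in dictionary_k_mismatch("abcab", ["ab", "cb"], k=1):
        print(f"pattern {idx} ends at text position {end}")
```

A baseline of this kind is mainly useful as a correctness reference on small inputs when testing a space-efficient implementation.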
https://drops.dagstuhl.de/storage/00lipics/lipics-vol128-cpm2019/LIPIcs.CPM.2019.21/LIPIcs.CPM.2019.21.pdf
Streaming
multiple pattern matching
Hamming distance