Starikovskaya, Tatiana
Streaming Pattern Matching (Invited Talk)
Abstract
Many classical algorithms for string processing assume that the input can be accessed in full via constant-time random access, which poses a serious limitation in the modern era of data deluge. In this talk, we will focus on the streaming model of computation, which makes it possible to overcome this issue. In this model, we assume that the input arrives as a stream, one character at a time, which captures situations where the data are sequential measurements or the output of an algorithm. The space complexity is defined as all the space used, including the space used to store any information about the input, which allows one to develop ultra-efficient algorithms.
The first streaming algorithm for pattern matching was presented in the seminal paper of Porat and Porat in FOCS 2009. For a pattern of length m, the algorithm uses only O(log m) space, while any classical algorithm requires Ω(m) space. This result served as a foundation for the area of streaming algorithms for pattern matching. After a brief survey of the area, we will discuss two questions in more detail: the k-mismatch problem and the pattern matching with k-edits problem. In the k-mismatch problem, one is given a pattern and a text, and the task is to find all substrings of the text that have at most k mismatches with the pattern. The current best algorithm for this problem was given by Clifford, Kociumaka, and Porat in SODA 2019, and for a pattern of length m it uses O(k log m) space and Õ(√k) time per character of the text. In the pattern matching with k-edits problem, the task is similar, but one must find substrings that can be transformed into the pattern by at most k edits, i.e., substitutions, insertions, and deletions of a character. For this problem, the first streaming algorithm was presented by Kociumaka, Porat, and Starikovskaya in FOCS 2021. The algorithm takes Õ(poly(k)) space and Õ(poly(k)) time per character of the text.
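To make the k-mismatch problem statement concrete, the following is a minimal sketch of a naive baseline, not any of the algorithms cited above. It reads the text one character at a time but buffers the last m characters, so it uses Θ(m) space and O(m) time per character, which is exactly the cost the streaming algorithms avoid. The function name `k_mismatch_stream` and the convention of reporting 0-based end positions are assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, Iterator


def k_mismatch_stream(pattern: str, text: Iterable[str], k: int) -> Iterator[int]:
    """Yield every 0-based end position i such that the length-m substring of
    the text ending at i has Hamming distance at most k from the pattern.

    Naive baseline (hypothetical helper): buffers the last m characters,
    so it needs Theta(m) space, unlike the small-space streaming algorithms.
    """
    m = len(pattern)
    window = deque(maxlen=m)  # sliding window over the last m characters
    for i, ch in enumerate(text):
        window.append(ch)  # the oldest character is dropped automatically
        if len(window) == m:
            # Hamming distance between the current window and the pattern
            mismatches = sum(a != b for a, b in zip(window, pattern))
            if mismatches <= k:
                yield i


if __name__ == "__main__":
    print(list(k_mismatch_stream("abc", "abcabdxbc", 1)))  # [2, 5, 8]
```

With k = 1, the windows "abc", "abd", and "xbc" are reported because each differs from the pattern "abc" in at most one position; with k = 0 only the exact occurrence ending at position 2 remains.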
BibTeX Entry
@InProceedings{starikovskaya:LIPIcs.ISAAC.2021.1,
author = {Starikovskaya, Tatiana},
title = {{Streaming Pattern Matching}},
booktitle = {32nd International Symposium on Algorithms and Computation (ISAAC 2021)},
pages = {1:1--1:1},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-214-3},
ISSN = {1868-8969},
year = {2021},
volume = {212},
editor = {Ahn, Hee-Kap and Sadakane, Kunihiko},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2021/15434},
URN = {urn:nbn:de:0030-drops-154345},
doi = {10.4230/LIPIcs.ISAAC.2021.1},
annote = {Keywords: Streaming algorithms, Pattern matching, Hamming distance, Edit distance}
}
Keywords: Streaming algorithms, Pattern matching, Hamming distance, Edit distance
Seminar: 32nd International Symposium on Algorithms and Computation (ISAAC 2021)
Issue date: 2021
Date of publication: 30.11.2021