Communication and Streaming Complexity of Approximate Pattern Matching

Author Tatiana Starikovskaya



PDF
Thumbnail PDF

File

LIPIcs.CPM.2017.13.pdf
  • Filesize: 427 kB
  • 11 pages

Document Identifiers

Author Details

Tatiana Starikovskaya

Cite AsGet BibTex

Tatiana Starikovskaya. Communication and Streaming Complexity of Approximate Pattern Matching. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 78, pp. 13:1-13:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)
https://doi.org/10.4230/LIPIcs.CPM.2017.13

Abstract

We consider the approximate pattern matching problem. Given a text T of length 2n and a pattern P of length n, the task is to decide for each prefix T[1, j] of T if it ends with a string that is at the edit distance at most k from P. If this is the case, we must output the edit distance and the corresponding edit operations. We first show the communication complexity of the problem for the case when Alice and Bob both share the pattern and Alice holds the first half of the text and Bob the second half, and for the case when Alice holds the first half of the text, Bob the second half of the text, and Charlie the pattern. We then develop the first sublinear-space streaming algorithm for the problem. The algorithm is randomised with error probability at most 1/poly(n).
Keywords
  • approximate pattern matching
  • edit distance
  • randomised algorithms
  • streaming algorithms
  • communication complexity

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Djamal Belazzougui. Efficient deterministic single round document exchange for edit distance, 2015. URL: http://arxiv.org/abs/1511.09229.
  2. Djamal Belazzougui and Qin Zhang. Edit distance: Sketching, streaming, and document exchange. In Irit Dinur, editor, Proceedings of the 57th IEEE Annual Symposium on Foundations of Computer Science (FOCS 2016), pages 51-60. IEEE Computer Society, 2016. URL: http://dx.doi.org/10.1109/FOCS.2016.15.
  3. Joshua Brakensiek, Venkatesan Guruswami, and Samuel Zbarsky. Efficient low-redundancy codes for correcting multiple deletions. In Robert Krauthgamer, editor, Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2016), pages 1884-1892. SIAM, 2016. URL: http://dx.doi.org/10.1137/1.9781611974331.ch132.
  4. Diptarka Chakraborty, Elazar Goldenberg, and Michal Koucký. Streaming algorithms for embedding and computing edit distance in the low distance regime. In Daniel Wichs and Yishay Mansour, editors, Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing (STOC 2016), pages 712-725. ACM, 2016. URL: http://dx.doi.org/10.1145/2897518.2897577.
  5. Raphaël Clifford, Allyx Fontaine, Ely Porat, Benjamin Sach, and Tatiana Starikovskaya. The k-mismatch problem revisited. In Robert Krauthgamer, editor, Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2016), pages 2039-2052. SIAM, 2016. URL: http://dx.doi.org/10.1137/1.9781611974331.ch142.
  6. Raphaël Clifford, Markus Jalsenius, and Benjamin Sach. Cell-probe bounds for online edit distance and other pattern matching problems. In Piotr Indyk, editor, Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2015), pages 552-561. SIAM, 2015. URL: http://dx.doi.org/10.1137/1.9781611973730.37.
  7. Raphaël Clifford and Tatiana Starikovskaya. Approximate Hamming distance in a stream. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, Proceedings of the 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), volume 55 of LIPIcs, pages 20:1-20:14. Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2016. URL: http://dx.doi.org/10.4230/LIPIcs.ICALP.2016.20.
  8. Graham Cormode, Mike Paterson, Süleyman Cenk Sahinalp, and Uzi Vishkin. Communication complexity of document exchange. In David B. Shmoys, editor, Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2000), pages 197-206. ACM/SIAM, 2000. URL: http://dl.acm.org/citation.cfm?id=338219.338252.
  9. Johannes Fischer and Volker Heun. Theoretical and practical improvements on the RMQ-problem, with applications to LCA and LCE. In Moshe Lewenstein and Gabriel Valiente, editors, Proceedings of the 17th Annual Symposium on Combinatorial Pattern Matching (CPM 2006), volume 4009 of LNCS, pages 36-48. Springer, 2006. URL: http://dx.doi.org/10.1007/11780441_5.
  10. Dan Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997. URL: http://dx.doi.org/10.1017/CBO9780511574931.
  11. Hossein Jowhari. Efficient communication protocols for deciding edit distance. In Leah Epstein and Paolo Ferragina, editors, Proceedings of the 20th Annual European Symposium on Algorithms (ESA 2012), volume 7501 of LNCS, pages 648-658. Springer, 2012. URL: http://dx.doi.org/10.1007/978-3-642-33090-2_56.
  12. Gad M. Landau and Uzi Vishkin. Introducing efficient parallelism into approximate string matching and a new serial algorithm. In Juris Hartmanis, editor, Proceedings of the 18th Annual ACM Symposium on Theory of Computing (STOC 1986), pages 220-230. ACM, 1986. URL: http://dx.doi.org/10.1145/12130.12152.
  13. Eugene W. Myers. An O(nd) difference algorithm and its variations. Algorithmica, 1(2):251-266, 1986. URL: http://dx.doi.org/10.1007/BF01840446.
  14. Gonzalo Navarro. A guided tour to approximate string matching. ACM Comput. Surv., 33(1):31-88, March 2001. URL: http://dx.doi.org/10.1145/375360.375365.
  15. Alon Orlitsky. Interactive communication: Balanced distributions, correlated files, and average-case complexity. In Michael Sipser, editor, Proceedings of the 32nd Annual Symposium on Foundations of Computer Science (FOCS 1991), pages 228-238. IEEE Computer Society, 1991. URL: http://dx.doi.org/10.1109/SFCS.1991.185373.
  16. Benny Porat and Ely Porat. Exact and approximate pattern matching in the streaming model. In Daniel A. Spielman, editor, Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2009), pages 315-323. IEEE Computer Society, 2009. URL: http://dx.doi.org/10.1109/FOCS.2009.11.
  17. Ely Porat and Ohad Lipsky. Improved sketching of Hamming distance with error correcting. In Bin Ma and Kaizhong Zhang, editors, Proceedings of the 18th Annual Symposium on Combinatorial Pattern Matching (CPM 2007), volume 4580 of LNCS, pages 173-182. Springer, 2007. URL: http://dx.doi.org/10.1007/978-3-540-73437-6_19.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail