Streaming Periodicity with Mismatches

Authors Funda Ergün, Elena Grigorescu, Erfan Sadeqi Azer, Samson Zhou



PDF
Thumbnail PDF

File

LIPIcs.APPROX-RANDOM.2017.42.pdf
  • Filesize: 0.59 MB
  • 21 pages

Document Identifiers

Author Details

Funda Ergün
Elena Grigorescu
Erfan Sadeqi Azer
Samson Zhou

Cite As Get BibTex

Funda Ergün, Elena Grigorescu, Erfan Sadeqi Azer, and Samson Zhou. Streaming Periodicity with Mismatches. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 81, pp. 42:1-42:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017) https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2017.42

Abstract

We study the problem of finding all k-periods of a length-n string S, presented as a data stream. S is said to have k-period p if its prefix of length n-p differs from its suffix of length n-p in at most k locations.

We give a one-pass streaming algorithm that computes the k-periods of a string S using poly(k, log n) bits of space, for k-periods of length at most n/2. We also present a two-pass streaming algorithm that computes k-periods of S using poly(k, log n) bits of space, regardless of period length. We complement these results with comparable lower bounds.

Subject Classification

Keywords
  • String algorithms
  • Streaming algorithms
  • Pattern matching
  • Randomized algorithms
  • Sublinear algorithms

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Amihood Amir, Estrella Eisenberg, and Avivit Levy. Approximate periodicity. Algorithms and Computation, pages 25-36, 2010. Google Scholar
  2. Alexandr Andoni, Assaf Goldberger, Andrew McGregor, and Ely Porat. Homomorphic fingerprints under misalignments: sketching edit and shift distances. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 931-940, 2013. Google Scholar
  3. Alberto Apostolico and Zvi Galil, editors. Pattern Matching Algorithms. Oxford University Press, Oxford, UK, 1997. Google Scholar
  4. Petra Berenbrink, Funda Ergün, Frederik Mallmann-Trenn, and Erfan Sadeqi Azer. Palindrome recognition in the streaming model. In 31st International Symposium on Theoretical Aspects of Computer Science (STACS), pages 149-161, 2014. Google Scholar
  5. Raphaël Clifford, Klim Efremenko, Ely Porat, and Amir Rothschild. From coding theory to efficient pattern matching. In Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 778-784, 2009. Google Scholar
  6. Raphaël Clifford, Allyx Fontaine, Ely Porat, Benjamin Sach, and Tatiana A. Starikovskaya. Dictionary matching in a stream. In Algorithms - ESA 23rd Annual European Symposium, Proceedings, pages 361-372, 2015. Google Scholar
  7. Raphaël Clifford, Allyx Fontaine, Ely Porat, Benjamin Sach, and Tatiana A. Starikovskaya. The k-mismatch problem revisited. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 2039-2052, 2016. Google Scholar
  8. Raphaël Clifford, Markus Jalsenius, Ely Porat, and Benjamin Sach. Space lower bounds for online pattern matching. Theoretical Computer Science, 483:68-74, 2013. Google Scholar
  9. Michael S. Crouch and Andrew McGregor. Periodicity and cyclic shifts via linear sketches. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques - 14th International Workshop, APPROX, and 15th International Workshop, RANDOM. Proceedings, pages 158-170, 2011. Google Scholar
  10. Mohamed G. Elfeky, Walid G. Aref, and Ahmed K. Elmagarmid. STAGGER: periodicity mining of data streams using expanding sliding windows. In Proceedings of the 6th IEEE International Conference on Data Mining (ICDM), pages 188-199, 2006. Google Scholar
  11. Funda Ergün, Elena Grigorescu, Erfan Sadeqi Azer, and Samson Zhou. Streaming periodicity with mismatches, 2017. URL: http://homes.soic.indiana.edu/fergun/PUBLICATIONS/mismatchperiodicity.pdf.
  12. Funda Ergün, Hossein Jowhari, and Mert Saglam. Periodicity in streams. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, 13th International Workshop, APPROX 2010, and 14th International Workshop, RANDOM 2010. Proceedings, pages 545-559, 2010. Google Scholar
  13. Funda Ergün, S. Muthukrishnan, and Süleyman Cenk Sahinalp. Periodicity testing with sublinear samples and space. ACM Trans. Algorithms, 6(2):43:1-43:14, 2010. Google Scholar
  14. Zvi Galil and Joel Seiferas. Time-space-optimal string matching. Journal of Computer and System Sciences, 26(3):280-294, 1983. Google Scholar
  15. Pawel Gawrychowski. Optimal pattern matching in LZW compressed strings. ACM Transactions on Algorithms (TALG), 9(3):25, 2013. Google Scholar
  16. Pawel Gawrychowski, Oleg Merkurev, Arseny M. Shur, and Przemyslaw Uznanski. Tight tradeoffs for real-time approximation of longest palindromes in streams. In 27th Annual Symposium on Combinatorial Pattern Matching, CPM, pages 18:1-18:13, 2016. Google Scholar
  17. Shay Golan, Tsvi Kopelowitz, and Ely Porat. Streaming Pattern Matching with d Wildcards. In 24th Annual European Symposium on Algorithms (ESA), pages 44:1-44:16, 2016. Google Scholar
  18. Elena Grigorescu, Erfan Sadeqi Azer, and Samson Zhou. Streaming for aibohphobes: Longest palindrome with mismatches. CoRR, abs/1705.01887, 2017. URL: http://arxiv.org/abs/1705.01887.
  19. Piotr Indyk, Nick Koudas, and S. Muthukrishnan. Identifying representative trends in massive time series data sets using sketches. In VLDB, Proceedings of 26th International Conference on Very Large Data Bases, pages 363-372, 2000. Google Scholar
  20. Richard M. Karp and Michael O. Rabin. Efficient randomized pattern-matching algorithms. IBM Journal of Research and Development, 31(2):249-260, 1987. Google Scholar
  21. Donald E. Knuth, James H. Morris Jr., and Vaughan R. Pratt. Fast pattern matching in strings. SIAM J. Comput., 6(2):323-350, 1977. Google Scholar
  22. Donald E. Knuth, James H. Morris Jr., and Vaughan R. Pratt. Fast pattern matching in strings. SIAM journal on computing, 6(2):323-350, 1977. Google Scholar
  23. Oded Lachish and Ilan Newman. Testing periodicity. Algorithmica, 60(2):401-420, 2011. Google Scholar
  24. Peter Bro Miltersen, Noam Nisan, Shmuel Safra, and Avi Wigderson. On data structures and asymmetric communication complexity. In Proceedings of the Twenty-Seventh Annual ACM Symposium on Theory of Computing, pages 103-111, 1995. Google Scholar
  25. Benny Porat and Ely Porat. Exact and approximate pattern matching in the streaming model. In 50th Annual IEEE Symposium on Foundations of Computer Science, FOCS, pages 315-323, 2009. Google Scholar
  26. Ely Porat and Ohad Lipsky. Improved sketching of hamming distance with error correcting. In Annual Symposium on Combinatorial Pattern Matching, pages 173-182, 2007. Google Scholar
  27. Jakub Radoszewski and Tatiana Starikovskaya. Streaming k-mismatch with data recovery and applications. arXiv preprint arXiv:1607.05626, 2016. Google Scholar
  28. Andrew Chi-Chih Yao. Probabilistic computations: Toward a unified measure of complexity (extended abstract). In 18th Annual Symposium on Foundations of Computer Science, FOCS, pages 222-227, 1977. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail