Order-Preserving Squares in Strings

Authors Paweł Gawrychowski , Samah Ghazawi, Gad M. Landau



PDF
Thumbnail PDF

File

LIPIcs.CPM.2023.13.pdf
  • Filesize: 2.13 MB
  • 19 pages

Document Identifiers

Author Details

Paweł Gawrychowski
  • Institute of Computer Science, University of Wrocław, Poland
Samah Ghazawi
  • Department of Computer Science, University of Haifa, Israel
Gad M. Landau
  • Department of Computer Science, University of Haifa, Israel
  • Department of Computer Science and Engineering, NYU Tandon School of Engineering, New York University, Brooklyn, NY, USA

Cite AsGet BibTex

Paweł Gawrychowski, Samah Ghazawi, and Gad M. Landau. Order-Preserving Squares in Strings. In 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 259, pp. 13:1-13:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.CPM.2023.13

Abstract

An order-preserving square in a string is a fragment of the form uv where u ≠ v and u is order-isomorphic to v. We show that a string w of length n over an alphabet of size σ contains 𝒪(σn) order-preserving squares that are distinct as words. This improves the upper bound of 𝒪(σ²n) by Kociumaka, Radoszewski, Rytter, and Waleń [TCS 2016]. Further, for every σ and n we exhibit a string with Ω(σn) order-preserving squares that are distinct as words, thus establishing that our upper bound is asymptotically tight. Finally, we design an 𝒪(σn) time algorithm that outputs all order-preserving squares that occur in a given string and are distinct as words. By our lower bound, this is optimal in the worst case.

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • repetitions
  • distinct squares
  • order-isomorphism

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. S. Brlek and S. Li. On the number of squares in a finite word. arXiv, 2022. URL: https://arxiv.org/abs/2204.10204.
  2. R. Cole and R. Hariharan. Faster suffix tree construction with missing suffix links. STOC, pages 407-415, 2000. Google Scholar
  3. M. Crochemore, C. S. Iliopoulos, T. Kociumaka, M. Kubica, A. Langiu, S. P. Pissis, J. Radoszewski, W. Rytter, and T. Waleń. Order-preserving incomplete suffix trees and order-preserving indexes. SPIRE, 8214:84-95, 2013. Google Scholar
  4. M. Crochemore, C. S. Iliopoulos, T. Kociumaka, M. Kubica, A. Langiu, S. P. Pissis, J. Radoszewski, W. Rytter, and T. Waleń. Order-preserving indexing. Theoretical Computer Science, 638:122-135, 2016. Google Scholar
  5. M. Crochemore and W. Rytter. Squares, cubes, and time-space efficient string searching. Algorithmica, 13:405-425, 1995. Google Scholar
  6. L. J. Cummings and W. F. Smyth. Weak repetitions in strings. J. Combinatorial Math. Combinatorial Comput, 24:33-48, 1997. Google Scholar
  7. A. Deza, F. Franek, and A. Thierry. How many double squares can a string contain? Discrete Applied Mathematics, 180:52-69, 2015. Google Scholar
  8. P. Erdős. Some unsolved problems. Magy. Tud. Akad. Mat. Kut. Intéz. Közl., 6:221-254, 1961. Google Scholar
  9. A. A. Evdokimov. Strongly asymmetric sequences generated by a finite number of symbols. Dokl. Akad. Nauk SSSR, 179(6):1268-1271, 1968. Google Scholar
  10. A. S. Fraenkel and J. Simpson. How many squares can a string contain? Combinatorial Theory, Series A, 82(1):112-120, 1998. Google Scholar
  11. P. Gawrychowski. Pattern matching in Lempel-Ziv compressed strings: Fast, simple, and deterministic. ESA, 6942:421-432, 2011. Google Scholar
  12. P. Gawrychowski. Pattern matching in Lempel-Ziv compressed strings: fast, simple, and deterministic. arXive, 2011. URL: https://arxiv.org/abs/1104.4203.
  13. G. Gourdel, T. Kociumaka, J. Radoszewski, W. Rytter, A. Shur, and T. Waleń. String periods in the order-preserving model. Information and Computation, 270:104463, 2020. Google Scholar
  14. D. Gusfield and J. Stoye. Linear time algorithms for finding and representing all the tandem repeats in a string. Computer and System Sciences, 69(4):525-546, 2004. Google Scholar
  15. D. Harel and R. E. Tarjan. Fast algorithms for finding nearest common ancestors. SIAM J. Comput., 13(2):338-355, 1984. Google Scholar
  16. M. Huova, J. Karhumäki, and A. Saarela. Problems in between words and abelian words: k-abelian avoidability. Theoretical Computer Science, 454:172-177, 2012. Google Scholar
  17. L. Ilie. A simple proof that a word of length n has at most 2n distinct squares. Journal of Combinatorial Theory, Series A, 112(1):163-164, 2005. Google Scholar
  18. L. Ilie. A note on the number of squares in a word. Theoretical Computer Science, 380(3):373-376, 2007. Google Scholar
  19. V. Keränen. Abelian squares are avoidable on 4 letters. Automata, Languages and Programming, pages 41-52, 1992. Google Scholar
  20. T. Kociumaka, J. Radoszewski, W. Rytter, and T. Waleń. Maximum number of distinct and nonequivalent nonstandard squares in a word. Theoretical Computer Science, 648(C):84-95, 2016. Google Scholar
  21. T. Kociumaka, J. Radoszewski, and B. Wiśniewski. Subquadratic-time algorithms for abelian stringology problems. Mathematical Aspects of Computer and Information Sciences, pages 320-334, 2016. Google Scholar
  22. M. Kubica, T. Kulczyński, J. Radoszewski, W. Rytter, and T. Waleń. A linear time algorithm for consecutive permutation pattern matching. Information Processing Letters, 113(12):430-433, 2013. Google Scholar
  23. N. H. Lam. On the number of squares in a string. AdvOL-Report 2, 2013. Google Scholar
  24. F. Manea and S. Seki. Square-density increasing mappings. In 10th WORDS, 9304:160-169, 2015. Google Scholar
  25. Y. Matsuoka, T. Aoki, S. Inenaga, H. Bannai, and M. Takeda. Generalized pattern matching and periodicity under substring consistent equivalence relations. Theoretical Computer Science, 656:225-233, 2016. Google Scholar
  26. P. A. B. Pleasants. Non-repetitive sequences. Mathematical Proceedings of the Cambridge Philosophical Society, 68(2):267-274, 1970. Google Scholar
  27. A. Thierry. A proof that a word of length n has less than 1.5n distinct squares. arXiv, 2020. URL: https://arxiv.org/abs/2001.02996.
  28. A. Thue. Über unendliche Zeichenreihen. Norske Vid Selsk. Skr. I Mat-Nat Kl.(Christiana), 7:1-22, 1906. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail