Optimal Bounds for Distinct Quartics

Authors: Panagiotis Charalampopoulos, Paweł Gawrychowski, Samah Ghazawi



File

LIPIcs.ICALP.2024.39.pdf
  • Filesize: 0.98 MB
  • 17 pages

Author Details

Panagiotis Charalampopoulos
  • School of Computing and Mathematical Sciences, Birkbeck, University of London, UK
Paweł Gawrychowski
  • Institute of Computer Science, University of Wrocław, Poland
Samah Ghazawi
  • Department of Computer Science, University of Haifa, Israel
  • Department of Software Engineering, Braude College of Engineering, Karmiel, Israel

Cite As

Panagiotis Charalampopoulos, Paweł Gawrychowski, and Samah Ghazawi. Optimal Bounds for Distinct Quartics. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 39:1-39:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ICALP.2024.39

Abstract

A fundamental concept related to strings is that of repetitions. It has been extensively studied in many versions, from both purely combinatorial and algorithmic angles. One of the most basic questions is how many distinct squares, i.e., distinct strings of the form UU, a string of length n can contain as fragments. It turns out that this is always 𝒪(n), and the bound cannot be improved to sublinear in n [Fraenkel and Simpson, JCTA 1998]. Several similar questions about repetitions in strings have been considered, and by now we seem to have a good understanding of their repetitive structure. For higher-dimensional strings, the basic concept of periodicity has been successfully extended and applied to design efficient algorithms, although it is inherently more complex than for regular strings. Extending the notion of repetitions and understanding the repetitive structure of higher-dimensional strings is, however, far from complete. Quartics were introduced by Apostolico and Brimkov [TCS 2000] as analogues of squares in two dimensions. Charalampopoulos, Radoszewski, Rytter, Waleń, and Zuba [ESA 2020] proved that the number of distinct quartics in an n×n 2D string is 𝒪(n²log²n) and that they can be computed in 𝒪(n²log²n) time. Gawrychowski, Ghazawi, and Landau [SPIRE 2021] constructed an infinite family of n×n 2D strings with Ω(n²log n) distinct quartics. This brings the challenge of determining asymptotically tight bounds. Here, we settle both the combinatorial and the algorithmic aspects of this question: the number of distinct quartics in an n×n 2D string is 𝒪(n²log n) and they can be computed in the worst-case optimal 𝒪(n²log n) time. As expected, our solution heavily exploits the periodic structure implied by occurrences of quartics. However, the two-dimensional nature of the problem introduces some technical challenges. Somewhat surprisingly, we overcome the final challenge for the combinatorial bound using a result of Marcus and Tardos [JCTA 2004] for permutation avoidance on matrices.
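
To make the two counted objects concrete, here is a minimal brute-force sketch in Python. It is only an illustration of the definitions, not the algorithm of the paper, and it runs in polynomial time far above the 𝒪(n²log n) bound stated in the abstract. The function names `distinct_squares` and `distinct_quartics` are hypothetical. The sketch assumes that a quartic is a 2D fragment composed of 2×2 copies of a nonempty, possibly rectangular, block W; the abstract itself only describes quartics as two-dimensional analogues of squares.

```python
def distinct_squares(s):
    """Return the set of distinct fragments of s of the form UU (U nonempty)."""
    n = len(s)
    found = set()
    for i in range(n):
        # 'half' is |U|; the square occupies s[i : i + 2*half].
        for half in range(1, (n - i) // 2 + 1):
            if s[i:i + half] == s[i + half:i + 2 * half]:
                found.add(s[i:i + 2 * half])
    return found


def distinct_quartics(grid):
    """Return the set of distinct fragments of a 2D string made of 2x2 copies
    of a block W. 'grid' is a list of equal-length strings (assumed layout)."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    found = set()
    for r in range(rows):
        for c in range(cols):
            # h and w are the height and width of the block W.
            for h in range(1, (rows - r) // 2 + 1):
                for w in range(1, (cols - c) // 2 + 1):
                    sub = tuple(grid[r + i][c:c + 2 * w] for i in range(2 * h))
                    # Left half equals right half in every row ...
                    horizontal = all(row[:w] == row[w:] for row in sub)
                    # ... and top half equals bottom half.
                    vertical = sub[:h] == sub[h:]
                    if horizontal and vertical:
                        found.add(sub)
    return found
```

Storing each fragment itself in a set means that repeated occurrences of the same square or quartic are counted once, which is what "distinct" refers to in the bounds above.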

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • 2D strings
  • quartics
  • repetitions
  • periodicity

References

  1. Amihood Amir and Gary Benson. Efficient two-dimensional compressed matching. In Data Compression Conference, pages 279-288. IEEE Computer Society, 1992. URL: https://doi.org/10.1109/DCC.1992.227453.
  2. Amihood Amir and Gary Benson. Two-dimensional periodicity in rectangular arrays. SIAM J. Comput., 27(1):90-106, 1998. URL: https://doi.org/10.1137/S0097539795298321.
  3. Amihood Amir, Gary Benson, and Martin Farach. An alphabet independent approach to two-dimensional pattern matching. SIAM Journal on Computing, 23(2):313-323, 1994. URL: https://doi.org/10.1137/S0097539792226321.
  4. Amihood Amir, Gary Benson, and Martin Farach. Optimal two-dimensional compressed matching. J. Algorithms, 24(2):354-379, 1997. URL: https://doi.org/10.1006/JAGM.1997.0860.
  5. Amihood Amir, Gary Benson, and Martin Farach. Optimal parallel two dimensional text searching on a CREW PRAM. Inf. Comput., 144(1):1-17, 1998. URL: https://doi.org/10.1006/INCO.1998.2705.
  6. Amihood Amir, Ayelet Butman, Moshe Lewenstein, and Ely Porat. Real two dimensional scaled matching. Algorithmica, 53(3):314-336, 2009. URL: https://doi.org/10.1007/S00453-007-9021-X.
  7. Amihood Amir and Eran Chencinski. Faster two dimensional scaled matching. Algorithmica, 56(2):214-234, 2010. URL: https://doi.org/10.1007/S00453-008-9173-3.
  8. Amihood Amir and Martin Farach. Two-dimensional dictionary matching. Inf. Process. Lett., 44(5):233-239, 1992. URL: https://doi.org/10.1016/0020-0190(92)90206-B.
  9. Amihood Amir and Martin Farach. Efficient 2-dimensional approximate matching of half-rectangular figures. Inf. Comput., 118(1):1-11, 1995. URL: https://doi.org/10.1006/INCO.1995.1047.
  10. Amihood Amir, Gad M. Landau, Shoshana Marcus, and Dina Sokol. Two-dimensional maximal repetitions. Theoretical Computer Science, 812:49-61, 2020. URL: https://doi.org/10.1016/j.tcs.2019.07.006.
  11. Amihood Amir, Gad M. Landau, and Dina Sokol. Inplace 2d matching in compressed images. J. Algorithms, 49(2):240-261, 2003. URL: https://doi.org/10.1016/S0196-6774(03)00088-9.
  12. A. Apostolico and V.E. Brimkov. Fibonacci arrays and their two-dimensional repetitions. Theoretical Computer Science, 237(1-2):263-273, 2000. URL: https://doi.org/10.1016/S0304-3975(98)00182-0.
  13. Alberto Apostolico. Optimal parallel detection of squares in strings. Algorithmica, 8(4):285-319, 1992. URL: https://doi.org/10.1007/BF01758848.
  14. Alberto Apostolico and Dany Breslauer. An optimal 𝒪(log log n)-time parallel algorithm for detecting all squares in a string. SIAM J. Comput., 25(6):1318-1331, 1996. URL: https://doi.org/10.1137/S0097539793260404.
  15. Alberto Apostolico and Valentin E. Brimkov. Optimal discovery of repetitions in 2D. Discrete Applied Mathematics, 151(1-3):5-20, 2005. URL: https://doi.org/10.1016/j.dam.2005.02.019.
  16. Alberto Apostolico and Franco P. Preparata. Optimal off-line detection of repetitions in a string. Theor. Comput. Sci., 22:297-315, 1983. URL: https://doi.org/10.1016/0304-3975(83)90109-3.
  17. Alberto Apostolico and Franco P. Preparata. Data structures and algorithms for the string statistics problem. Algorithmica, 15(5):481-494, 1996.
  18. Theodore P. Baker. A technique for extending rapid exact-match string matching to arrays of more than one dimension. SIAM Journal on Computing, 7(4):533-541, 1978. URL: https://doi.org/10.1137/0207043.
  19. Hideo Bannai, Tomohiro I, Shunsuke Inenaga, Yuto Nakashima, Masayuki Takeda, and Kazuya Tsuruta. The "runs" theorem. SIAM Journal on Computing, 46(5):1501-1514, 2017. URL: https://doi.org/10.1137/15M1011032.
  20. Hideo Bannai, Shunsuke Inenaga, and Dominik Köppl. Computing all distinct squares in linear time for integer alphabets. In CPM, pages 22:1-22:18, 2017. URL: https://doi.org/10.4230/LIPICS.CPM.2017.22.
  21. Hideo Bannai, Shunsuke Inenaga, and Dominik Köppl. Computing all distinct squares in linear time for integer alphabets. In 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017, volume 78 of LIPIcs, pages 22:1-22:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017. URL: https://doi.org/10.4230/LIPIcs.CPM.2017.22.
  22. Jean Berstel and Dominique Perrin. The origins of combinatorics on words. Eur. J. Comb., 28(3):996-1022, 2007. URL: https://doi.org/10.1016/J.EJC.2005.07.019.
  23. Richard S. Bird. Two dimensional pattern matching. Inf. Process. Lett., 6(5):168-170, 1977. URL: https://doi.org/10.1016/0020-0190(77)90017-5.
  24. S. Brlek and S. Li. On the number of squares in a finite word. arXiv, 2022. URL: https://arxiv.org/abs/2204.10204.
  25. Srecko Brlek and Shuo Li. On the number of distinct squares in finite sequences: Some old and new results. In Combinatorics on Words - 14th International Conference, WORDS 2023, pages 35-44, 2023. URL: https://doi.org/10.1007/978-3-031-33180-0_3.
  26. Gerth Stølting Brodal, Rune B. Lyngsø, Anna Östlin, and Christian N. S. Pedersen. Solving the string statistics problem in time 𝒪(n log n). In ICALP, volume 2380 of Lecture Notes in Computer Science, pages 728-739. Springer, 2002.
  27. P. Charalampopoulos, J. Radoszewski, W. Rytter, T. Waleń, and W. Zuba. The number of repetitions in 2D-strings. In 28th Annual European Symposium on Algorithms, ESA 2020, pages 1-18, 2020. URL: https://doi.org/10.4230/LIPICS.ESA.2020.32.
  28. Ying Choi and Tak Wah Lam. Dynamic suffix tree and two-dimensional texts management. Inf. Process. Lett., 61(4):213-220, 1997. URL: https://doi.org/10.1016/S0020-0190(97)00018-5.
  29. Raphaël Clifford, Allyx Fontaine, Tatiana Starikovskaya, and Hjalte Wedel Vildhøj. Dynamic and approximate pattern matching in 2D. In SPIRE, pages 133-144, 2016. URL: https://doi.org/10.1007/978-3-319-46049-9_13.
  30. Maxime Crochemore. An optimal algorithm for computing the repetitions in a word. Information Processing Letters, 12(5):244-250, 1981.
  31. Maxime Crochemore, Leszek Gąsieniec, Ramesh Hariharan, S. Muthukrishnan, and Wojciech Rytter. A constant time optimal parallel algorithm for two-dimensional pattern matching. SIAM J. Comput., 27(3):668-681, 1998. URL: https://doi.org/10.1137/S0097539795280068.
  32. Maxime Crochemore, Leszek Gąsieniec, Wojciech Plandowski, and Wojciech Rytter. Two-dimensional pattern matching in linear time and small space. In STACS 95, 12th Annual Symposium on Theoretical Aspects of Computer Science, pages 181-192, 1995. URL: https://doi.org/10.1007/3-540-59042-0_72.
  33. Maxime Crochemore, Christophe Hancart, and Thierry Lecroq. Algorithms on strings. Cambridge University Press, 2007.
  34. Maxime Crochemore and Lucian Ilie. Maximal repetitions in strings. J. Comput. Syst. Sci., 74(5):796-807, 2008. URL: https://doi.org/10.1016/j.jcss.2007.09.003.
  35. Maxime Crochemore, Lucian Ilie, and Liviu Tinta. The "runs" conjecture. Theor. Comput. Sci., 412(27):2931-2941, 2011. URL: https://doi.org/10.1016/j.tcs.2010.06.019.
  36. Maxime Crochemore, Costas S. Iliopoulos, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Extracting powers and periods in a word from its runs structure. Theor. Comput. Sci., 521:29-41, 2014. URL: https://doi.org/10.1016/J.TCS.2013.11.018.
  37. Maxime Crochemore and Wojciech Rytter. Efficient parallel algorithms to test square-freeness and factorize strings. Inf. Process. Lett., 38(2):57-60, 1991. URL: https://doi.org/10.1016/0020-0190(91)90223-5.
  38. Maxime Crochemore and Wojciech Rytter. Usefulness of the Karp-Miller-Rosenberg algorithm in parallel computations on strings and arrays. Theor. Comput. Sci., 88(1):59-82, 1991. URL: https://doi.org/10.1016/0304-3975(91)90073-B.
  39. Maxime Crochemore and Wojciech Rytter. On linear-time alphabet-independent 2-dimensional pattern matching. In LATIN '95: Theoretical Informatics, pages 220-229, 1995. URL: https://doi.org/10.1007/3-540-59175-3_91.
  40. Maxime Crochemore and Wojciech Rytter. Squares, cubes, and time-space efficient string searching. Algorithmica, 13(5):405-425, 1995. URL: https://doi.org/10.1007/BF01190846.
  41. A. Deza, F. Franek, and A. Thierry. How many double squares can a string contain? Discrete Applied Mathematics, 180:52-69, 2015. URL: https://doi.org/10.1016/J.DAM.2014.08.016.
  42. R. P. Dilworth. A decomposition theorem for partially ordered sets. Annals of Mathematics, 51(1):161-166, 1950. URL: http://www.jstor.org/stable/1969503.
  43. Jonas Ellert and Johannes Fischer. Linear Time Runs Over General Ordered Alphabets. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021), pages 63:1-63:16, 2021. URL: https://doi.org/10.4230/LIPICS.ICALP.2021.63.
  44. Jonas Ellert, Paweł Gawrychowski, and Garance Gourdel. Optimal square detection over general alphabets. In Proceedings of the 2023 ACM-SIAM Symposium on Discrete Algorithms, SODA 2023, pages 5220-5242. SIAM, 2023. URL: https://doi.org/10.1137/1.9781611977554.CH189.
  45. Nathan J. Fine and Herbert S. Wilf. Uniqueness theorems for periodic functions. Proceedings of the American Mathematical Society, 16(1):109-114, 1965. URL: https://doi.org/10.2307/2034009.
  46. Aviezri S. Fraenkel and Jamie Simpson. How many squares can a string contain? Journal of Combinatorial Theory, Series A, 82(1):112-120, 1998. URL: https://doi.org/10.1006/jcta.1997.2843.
  47. Zvi Galil and Kunsoo Park. Alphabet-independent two-dimensional witness computation. SIAM J. Comput., 25(5):907-935, 1996. URL: https://doi.org/10.1137/S0097539792241941.
  48. P. Gawrychowski, S. Ghazawi, and Gad M. Landau. Lower bounds for the number of repetitions in 2D strings. In SPIRE 2021, pages 179-192, 2021. URL: https://doi.org/10.1007/978-3-030-86692-1_15.
  49. Raffaele Giancarlo. A generalization of the suffix tree to square matrices, with applications. SIAM J. Comput., 24(3):520-562, 1995. URL: https://doi.org/10.1137/S0097539792231982.
  50. Raffaele Giancarlo and Roberto Grossi. On the construction of classes of suffix trees for square matrices: Algorithms and applications. Inf. Comput., 130(2):151-182, 1996. URL: https://doi.org/10.1006/INCO.1996.0087.
  51. Mathieu Giraud. Not so many runs in strings. In Language and Automata Theory and Applications, Second International Conference, LATA 2008, volume 5196, pages 232-239. Springer, 2008. URL: https://doi.org/10.1007/978-3-540-88282-4_22.
  52. Mathieu Giraud. Asymptotic behavior of the numbers of runs and microruns. Inf. Comput., 207(11):1221-1228, 2009. URL: https://doi.org/10.1016/j.ic.2009.02.007.
  53. Dan Gusfield. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology. Cambridge University Press, 1997.
  54. Dan Gusfield and Jens Stoye. Linear time algorithms for finding and representing all the tandem repeats in a string. J. Comput. Syst. Sci., 69(4):525-546, 2004. URL: https://doi.org/10.1016/J.JCSS.2004.03.004.
  55. Jin-Ju Hong and Gen-Huey Chen. Efficient on-line repetition detection. Theor. Comput. Sci., 407(1-3):554-563, 2008. URL: https://doi.org/10.1016/j.tcs.2008.08.038.
  56. Ramana M. Idury and Alejandro A. Schäffer. Multiple matching of rectangular patterns. Inf. Comput., 117(1):78-90, 1995. URL: https://doi.org/10.1006/INCO.1995.1030.
  57. L. Ilie. A simple proof that a word of length n has at most 2n distinct squares. Journal of Combinatorial Theory, Series A, 112(1):163-164, 2005. URL: https://doi.org/10.1016/J.JCTA.2005.01.006.
  58. L. Ilie. A note on the number of squares in a word. Theoretical Computer Science, 380(3):373-376, 2007. URL: https://doi.org/10.1016/J.TCS.2007.03.025.
  59. Juha Kärkkäinen and Esko Ukkonen. Two- and higher-dimensional pattern matching in optimal expected time. SIAM J. Comput., 29(2):571-589, 1999. URL: https://doi.org/10.1137/S0097539794275872.
  60. Donald E. Knuth, James H. Morris Jr., and Vaughan R. Pratt. Fast pattern matching in strings. SIAM Journal on Computing, 6(2):323-350, 1977. URL: https://doi.org/10.1137/0206024.
  61. Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Fast algorithm for partial covers in words. Algorithmica, 73(1):217-233, 2015.
  62. Roman M. Kolpakov and Gregory Kucherov. Finding maximal repetitions in a word in linear time. In 40th Annual Symposium on Foundations of Computer Science, FOCS 1999, pages 596-604. IEEE Computer Society, 1999. URL: https://doi.org/10.1109/SFFCS.1999.814634.
  63. Dmitry Kosolobov. Online square detection. CoRR, abs/1411.2022, 2014. URL: https://arxiv.org/abs/1411.2022.
  64. Dmitry Kosolobov. Online detection of repetitions with backtracking. In Combinatorial Pattern Matching - 26th Annual Symposium, CPM 2015, volume 9133, pages 295-306. Springer, 2015. URL: https://doi.org/10.1007/978-3-319-19929-0_25.
  65. N. H. Lam. On the number of squares in a string. AdvOL-Report 2, 2013.
  66. Ho-fung Leung, Zeshan Peng, and Hing-Fung Ting. An efficient algorithm for online square detection. Theor. Comput. Sci., 363(1):69-75, 2006. URL: https://doi.org/10.1016/J.TCS.2006.06.011.
  67. Michael G. Main and Richard J. Lorentz. An 𝒪(n log n) algorithm for finding all repetitions in a string. J. Algorithms, 5(3):422-432, 1984. URL: https://doi.org/10.1016/0196-6774(84)90021-X.
  68. Adam Marcus and Gábor Tardos. Excluded permutation matrices and the Stanley-Wilf conjecture. J. Comb. Theory, Ser. A, 107(1):153-160, 2004. URL: https://doi.org/10.1016/J.JCTA.2004.04.002.
  69. Shoshana Neuburger and Dina Sokol. Succinct 2D dictionary matching. Algorithmica, 65(3):662-684, 2013. URL: https://doi.org/10.1007/S00453-012-9615-9.
  70. Simon J. Puglisi, Jamie Simpson, and William F. Smyth. How many runs can a string contain? Theor. Comput. Sci., 401(1-3):165-171, 2008. URL: https://doi.org/10.1016/J.TCS.2008.04.020.
  71. Jakub Radoszewski. Linear time construction of cover suffix tree and applications. In ESA, volume 274 of LIPIcs, pages 89:1-89:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023.
  72. Azriel Rosenfeld and Avinash C. Kak. Digital Picture Processing: Volume 1 and 2. Computer Science and Applied Mathematics. Academic Press, Orlando, FL, 2 edition, 1982.
  73. Wojciech Rytter. The number of runs in a string: Improved analysis of the linear upper bound. In STACS 2006, 23rd Annual Symposium on Theoretical Aspects of Computer Science, pages 184-195, 2006. URL: https://doi.org/10.1007/11672142_14.
  74. A. Thierry. A proof that a word of length n has less than 1.5n distinct squares. arXiv, 2020. URL: https://arxiv.org/abs/2001.02996.
  75. A. Thue. Über unendliche Zeichenreihen. Norske Vid. Selsk. Skr., I Mat.-Nat. Kl., Christiania, 7:1-22, 1906.