Linear-Time Computation of Cyclic Roots and Cyclic Covers of a String

Authors Costas S. Iliopoulos , Tomasz Kociumaka , Jakub Radoszewski , Wojciech Rytter , Tomasz Waleń , Wiktor Zuba



PDF
Thumbnail PDF

File

LIPIcs.CPM.2023.15.pdf
  • Filesize: 0.94 MB
  • 15 pages

Document Identifiers

Author Details

Costas S. Iliopoulos
  • Department of Informatics, King’s College London, London, UK
Tomasz Kociumaka
  • Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany
Jakub Radoszewski
  • Institute of Informatics, University of Warsaw, Poland
Wojciech Rytter
  • Institute of Informatics, University of Warsaw, Poland
Tomasz Waleń
  • Institute of Informatics, University of Warsaw, Poland
Wiktor Zuba
  • CWI, Amsterdam, The Netherlands

Cite AsGet BibTex

Costas S. Iliopoulos, Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba. Linear-Time Computation of Cyclic Roots and Cyclic Covers of a String. In 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 259, pp. 15:1-15:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.CPM.2023.15

Abstract

Cyclic versions of covers and roots of a string are considered in this paper. A prefix V of a string S is a cyclic root of S if S is a concatenation of cyclic rotations of V. A prefix V of S is a cyclic cover of S if the occurrences of the cyclic rotations of V cover all positions of S. We present 𝒪(n)-time algorithms computing all cyclic roots (using number-theoretic tools) and all cyclic covers (using tools related to seeds) of a length-n string over an integer alphabet. Our results improve upon 𝒪(n log log n) and 𝒪(n log n) time complexities of recent algorithms of Grossi et al. (WALCOM 2023) for the respective problems and provide novel approaches to the problems. As a by-product, we obtain an optimal data structure for Internal Circular Pattern Matching queries that generalize Internal Pattern Matching and Cyclic Equivalence queries of Kociumaka et al. (SODA 2015).

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • cyclic cover
  • cyclic root
  • circular pattern matching
  • internal pattern matching

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Lorraine A. K. Ayad, Carl Barton, and Solon P. Pissis. A faster and more accurate heuristic for cyclic edit distance computation. Pattern Recognition Letters, 88:81-87, 2017. URL: http://dx.doi.org/10.1016/j.patrec.2017.01.018.
  2. Lorraine A. K. Ayad and Solon P. Pissis. MARS: improving multiple circular sequence alignment using refined sequences. BMC Genomics, 18(1):86, 2017. URL: http://dx.doi.org/10.1186/s12864-016-3477-5.
  3. Carl Barton, Costas S. Iliopoulos, Ritu Kundu, Solon P. Pissis, Ahmad Retha, and Fatima Vayani. Accurate and efficient methods to improve multiple circular sequence alignment. In Evripidis Bampis, editor, Experimental Algorithms, SEA 2015, volume 9125 of Lecture Notes in Computer Science, pages 247-258. Springer, 2015. URL: http://dx.doi.org/10.1007/978-3-319-20086-6_19.
  4. Bastien Cazaux, Rodrigo Cánovas, and Eric Rivals. Shortest DNA cyclic cover in compressed space. In Ali Bilgin, Michael W. Marcellin, Joan Serra-Sagristà, and James A. Storer, editors, 2016 Data Compression Conference, DCC 2016, pages 536-545. IEEE, 2016. URL: http://dx.doi.org/10.1109/DCC.2016.79.
  5. Bastien Cazaux and Eric Rivals. A linear time algorithm for shortest cyclic cover of strings. Journal of Discrete Algorithms, 37:56-67, 2016. URL: http://dx.doi.org/10.1016/j.jda.2016.05.001.
  6. Bastien Cazaux and Eric Rivals. The power of greedy algorithms for approximating Max-ATSP, Cyclic Cover, and superstrings. Discrete Applied Mathematics, 212:48-60, 2016. URL: http://dx.doi.org/10.1016/j.dam.2015.06.003.
  7. Panagiotis Charalampopoulos, Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, and Wiktor Zuba. Circular pattern matching with k mismatches. Journal of Computer and System Sciences, 115:73-85, 2021. URL: http://dx.doi.org/10.1016/j.jcss.2020.07.003.
  8. Panagiotis Charalampopoulos, Tomasz Kociumaka, Jakub Radoszewski, Solon P. Pissis, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba. Approximate circular pattern matching. In Shiri Chechik, Gonzalo Navarro, Eva Rotenberg, and Grzegorz Herman, editors, 30th Annual European Symposium on Algorithms, ESA 2022, volume 244 of LIPIcs, pages 35:1-35:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022. URL: http://dx.doi.org/10.4230/LIPIcs.ESA.2022.35.
  9. Panagiotis Charalampopoulos, Tomasz Kociumaka, and Philip Wellnitz. Faster approximate pattern matching: A unified approach. In Sandy Irani, editor, 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, pages 978-989. IEEE, 2020. URL: http://dx.doi.org/10.1109/FOCS46700.2020.00095.
  10. Kuei-Hao Chen, Guan-Shieng Huang, and Richard Chia-Tung Lee. Bit-parallel algorithms for exact circular string matching. The Computer Journal, 57(5):731-743, March 2013. URL: http://dx.doi.org/10.1093/comjnl/bxt023.
  11. Maxime Crochemore, Costas S. Iliopoulos, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, and Wiktor Zuba. Shortest covers of all cyclic shifts of a string. Theoretical Computer Science, 866:70-81, 2021. URL: http://dx.doi.org/10.1016/j.tcs.2021.03.011.
  12. Maxime Crochemore, Costas S. Iliopoulos, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszýnski, Tomasz Waleń, and Wiktor Zuba. Linear-time computation of shortest covers of all rotations of a string. In Hideo Bannai and Jan Holub, editors, 33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022, volume 223 of LIPIcs, pages 22:1-22:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022. URL: http://dx.doi.org/10.4230/LIPIcs.CPM.2022.22.
  13. Maxime Crochemore, Thierry Lecroq, and Wojciech Rytter. 125 Problems in Text Algorithms: With Solutions. Cambridge University Press, 2021. Google Scholar
  14. Kimmo Fredriksson and Szymon Grabowski. Average-optimal string matching. Journal of Discrete Algorithms, 7(4):579-594, 2009. URL: http://dx.doi.org/10.1016/j.jda.2008.09.001.
  15. Paweł Gawrychowski. Pattern matching in Lempel-Ziv compressed strings: Fast, simple, and deterministic. In Camil Demetrescu and Magnús M. Halldórsson, editors, Algorithms - ESA 2011 - 19th Annual European Symposium, volume 6942 of Lecture Notes in Computer Science, pages 421-432. Springer, 2011. URL: http://dx.doi.org/10.1007/978-3-642-23719-5_36.
  16. Roberto Grossi, Costas S. Iliopoulos, Jesper Jansson, Zara Lim, Wing-Kin Sung, and Wiktor Zuba. Finding the cyclic covers of a string. In Chun-Cheng Lin, Bertrand M. T. Lin, and Giuseppe Liotta, editors, Algorithms and Computation - 17th International Conference and Workshops, WALCOM 2023, volume 13973 of Lecture Notes in Computer Science, pages 139-150. Springer, 2023. URL: http://dx.doi.org/10.1007/978-3-031-27051-2_13.
  17. Roberto Grossi, Costas S. Iliopoulos, Robert Mercas, Nadia Pisanti, Solon P. Pissis, Ahmad Retha, and Fatima Vayani. Circular sequence comparison: algorithms and applications. Algorithms for Molecular Biology, 11:12, 2016. URL: http://dx.doi.org/10.1186/s13015-016-0076-6.
  18. Godfrey H. Hardy and Edward M. Wright. An Introduction to the Theory of Numbers. Oxford University Press, 1960. Google Scholar
  19. Costas S. Iliopoulos, Solon P. Pissis, and M. Sohel Rahman. Searching and indexing circular patterns. In Mourad Elloumi, editor, Algorithms for Next-Generation Sequencing Data, Techniques, Approaches, and Applications, pages 77-90. Springer, 2017. URL: http://dx.doi.org/10.1007/978-3-319-59826-0_3.
  20. Tomasz Kociumaka. Efficient Data Structures for Internal Queries in Texts. PhD thesis, University of Warsaw, October 2018. URL: https://www.mimuw.edu.pl/~kociumaka/files/phd.pdf.
  21. Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. A linear-time algorithm for seeds computation. ACM Transactions on Algorithms, 16(2):27:1-27:23, 2020. URL: http://dx.doi.org/10.1145/3386369.
  22. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Internal pattern matching queries in a text and applications. In Piotr Indyk, editor, Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, pages 532-551. SIAM, 2015. URL: http://dx.doi.org/10.1137/1.9781611973730.36.
  23. M. Lothaire. Applied Combinatorics on Words. Cambridge University Press, 2005. URL: http://www.cambridge.org/gb/knowledge/isbn/item1172552/?site_locale=en_GB.
  24. Andrés Marzal, Ramón Mollineda, Guillermo Penis, and Enrique Vidal. Cyclic string matching: Efficient exact and approximate algorithms. In Dechang Chen and Xiuzhen Cheng, editors, Pattern Recognition and String Matching, pages 477-497. Springer US, Boston, MA, 2002. URL: http://dx.doi.org/10.1007/978-1-4613-0231-5_19.
  25. Dennis W. G. Moore and William F. Smyth. An optimal algorithm to compute all the covers of a string. Information Processing Letters, 50(5):239-246, 1994. URL: http://dx.doi.org/10.1016/0020-0190(94)00045-X.
  26. Dennis W. G. Moore and William F. Smyth. A correction to "An optimal algorithm to compute all the covers of a string". Information Processing Letters, 54(2):101-103, 1995. URL: http://dx.doi.org/10.1016/0020-0190(94)00235-Q.
  27. Vicente Palazón-González and Andrés Marzal. On the dynamic time warping of cyclic sequences for shape retrieval. Image Vision Computing, 30(12):978-990, 2012. URL: http://dx.doi.org/10.1016/j.imavis.2012.08.012.
  28. Vicente Palazón-González and Andrés Marzal. Speeding up the cyclic edit distance using LAESA with early abandon. Pattern Recognition Letters, 62:1-7, 2015. URL: http://dx.doi.org/10.1016/j.patrec.2015.04.013.
  29. Vicente Palazón-González, Andrés Marzal, and Juan Miguel Vilar. On hidden Markov models and cyclic strings for shape recognition. Pattern Recognition, 47(7):2490-2504, 2014. URL: http://dx.doi.org/10.1016/j.patcog.2014.01.018.
  30. Michael J. Tisza, Diana V. Pastrana, Nicole L. Welch, Brittany Stewart, Alberto Peretti, Gabriel J. Starrett, Yuk-Ying S. Pang, Siddharth R. Krishnamurthy, Patricia A. Pesavento, David H. McDermott, et al. Discovery of several thousand highly diverse circular DNA viruses. eLife, 9:e51971, 2020. URL: http://dx.doi.org/10.7554/eLife.51971.
  31. Edward K. Wagner, Martinez J. Hewlett, David C. Bloom, and David Camerini. Basic Virology, volume 3. Blackwell Science Malden, MA, 1999. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail