Minimal Suffix and Rotation of a Substring in Optimal Time

Author Tomasz Kociumaka



PDF
Thumbnail PDF

File

LIPIcs.CPM.2016.28.pdf
  • Filesize: 0.51 MB
  • 12 pages

Document Identifiers

Author Details

Tomasz Kociumaka

Cite AsGet BibTex

Tomasz Kociumaka. Minimal Suffix and Rotation of a Substring in Optimal Time. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 28:1-28:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/LIPIcs.CPM.2016.28

Abstract

For a text of length $n$ given in advance, the substring minimal suffix queries ask to determine the lexicographically minimal non-empty suffix of a substring specified by the location of its occurrence in the text. We develop a data structure answering such queries optimally: in constant time after linear-time preprocessing. This improves upon the results of Babenko et al. (CPM 2014), whose trade-off solution is characterized by Theta(n log n) product of these time complexities. Next, we extend our queries to support concatenations of O(1) substrings, for which the construction and query time is preserved. We apply these generalized queries to compute lexicographically minimal and maximal rotations of a given substring in constant time after linear-time preprocessing. Our data structures mainly rely on properties of Lyndon words and Lyndon factorizations. We combine them with further algorithmic and combinatorial tools, such as fusion trees and the notion of order isomorphism of strings.
Keywords
  • minimal suffix
  • minimal rotation
  • Lyndon factorization
  • substring canon- ization
  • substring queries

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Amihood Amir, Alberto Apostolico, Gad M. Landau, Avivit Levy, Moshe Lewenstein, and Ely Porat. Range LCP. J. Comput. Syst. Sci., 80(7):1245-1253, 2014. URL: http://dx.doi.org/10.1016/j.jcss.2014.02.010.
  2. Amihood Amir, Moshe Lewenstein, and Sharma V. Thankachan. Range LCP queries revisited. In Costas S. Iliopoulos, Simon J. Puglisi, and Emine Yilmaz, editors, String Processing and Information Retrieval, SPIRE 2015, volume 9309 of LNCS, pages 350-361. Springer, 2015. URL: http://dx.doi.org/10.1007/978-3-319-23826-5.
  3. Alberto Apostolico and Maxime Crochemore. Optimal canonization of all substrings of a string. Inf. Comput., 95(1):76-95, 1991. URL: http://dx.doi.org/10.1016/0890-5401(91)90016-U.
  4. Maxim Babenko, Paweł Gawrychowski, Tomasz Kociumaka, Ignat Kolesnichenko, and Tatiana Starikovskaya. Computing minimal and maximal suffixes of a substring. Theor. Comput. Sci., 2015. In press. URL: http://dx.doi.org/10.1016/j.tcs.2015.08.023.
  5. Maxim Babenko, Paweł Gawrychowski, Tomasz Kociumaka, and Tatiana Starikovskaya. Wavelet trees meet suffix trees. In Piotr Indyk, editor, 26th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, pages 572-591. SIAM, 2015. URL: http://dx.doi.org/10.1137/1.9781611973730.39.
  6. Hideo Bannai, Tomohiro I, Shunsuke Inenaga, Yuto Nakashima, Masayuki Takeda, and Kazuya Tsuruta. The "runs" theorem, 2015. URL: http://arxiv.org/abs/1406.0263v7.
  7. Michael A. Bender and Martin Farach-Colton. The LCA problem revisited. In Gaston H. Gonnet, Daniel Panario, and Alfredo Viola, editors, Latin American Symposium on Theoretical Informatics, LATIN 2000, volume 1776 of LNCS, pages 88-94. Springer Berlin Heidelberg, 2000. URL: http://dx.doi.org/10.1007/10719839_9.
  8. Kuo Tsai Chen, Ralph Hartzler Fox, and Roger Conant Lyndon. Free differential calculus, IV. The quotient groups of the lower central series. Ann. Math., 68(1):81-95, 1958. URL: http://dx.doi.org/10.2307/1970044.
  9. Graham Cormode and S. Muthukrishnan. Substring compression problems. In 16th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2005, pages 321-330. SIAM, 2005. URL: http://dl.acm.org/citation.cfm?id=1070432.1070478.
  10. Maxime Crochemore, Christophe Hancart, and Thierry Lecroq. Algorithms on Strings. Cambridge University Press, New York, NY, USA, 2007. Google Scholar
  11. Maxime Crochemore, Costas S. Iliopoulos, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Extracting powers and periods in a word from its runs structure. Theor. Comput. Sci., 521:29-41, 2014. URL: http://dx.doi.org/10.1016/j.tcs.2013.11.018.
  12. Maxime Crochemore, Roman Kolpakov, and Gregory Kucherov. Optimal bounds for computing α-gapped repeats. In Adrian-Horia Dediu, Jan Janousek, Carlos Martín-Vide, and Bianca Truthe, editors, Language and Automata Theory and Applications, LATA 2016, volume 9618 of LNCS, pages 245-255. Springer, 2016. URL: http://dx.doi.org/10.1007/978-3-319-30000-9_19.
  13. Maxime Crochemore and Dominique Perrin. Two-way string-matching. J. ACM, 38(3):650-674, July 1991. URL: http://dx.doi.org/10.1145/116825.116845.
  14. Jean-Pierre Duval. Factorizing words over an ordered alphabet. J. Algorithms, 4(4):363-381, 1983. URL: http://dx.doi.org/10.1016/0196-6774(83)90017-2.
  15. Michael L. Fredman and Dan E. Willard. Surpassing the information theoretic bound with fusion trees. J. Comput. Syst. Sci., 47(3):424-436, 1993. URL: http://dx.doi.org/10.1016/0022-0000(93)90040-4.
  16. Paweł Gawrychowski, Tomohiro I, Shunsuke Inenaga, Dominik Köppl, and Florin Manea. Efficiently finding all maximal α-gapped repeats. In Nicolas Ollinger and Heribert Vollmer, editors, Symposium on Theoretical Aspects of Computer Science, STACS 2016, volume 47 of LIPIcs, pages 39:1-39:14. Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2016. URL: http://dx.doi.org/10.4230/LIPIcs.STACS.2016.39.
  17. Torben Hagerup. Sorting and searching on the word RAM. In Michel Morvan, Christoph Meinel, and Daniel Krob, editors, Symposium on Theoretical Aspects of Computer Science, STACS 1998, volume 1373 of LNCS, pages 366-398. Springer, Berlin Heidelberg, 1998. URL: http://dx.doi.org/10.1007/BFb0028575.
  18. Dov Harel and Robert Endre Tarjan. Fast algorithms for finding nearest common ancestors. SIAM J. Comput., 13(2):338-355, 1984. URL: http://dx.doi.org/10.1137/0213024.
  19. Tomohiro I, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Efficient Lyndon factorization of grammar compressed text. In Johannes Fischer and Peter Sanders, editors, Combinatorial Pattern Matching, CPM 2013, volume 7922 of LNCS, pages 153-164. Springer, 2013. URL: http://dx.doi.org/10.1007/978-3-642-38905-4.
  20. Tomohiro I, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Faster Lyndon factorization algorithms for SLP and LZ78 compressed text. In Oren Kurland, Moshe Lewenstein, and Ely Porat, editors, String Processing and Information Retrieval, SPIRE 2013, volume 8214 of LNCS, pages 174-185. Springer International Publishing Switzerland, 2013. URL: http://dx.doi.org/10.1007/978-3-319-02432-5_21.
  21. Richard M. Karp and Michael O. Rabin. Efficient randomized pattern-matching algorithms. IBM J. Res. Dev., 31(2):249-260, 1987. URL: http://dx.doi.org/10.1147/rd.312.0249.
  22. Orgad Keller, Tsvi Kopelowitz, Shir Landau Feibish, and Moshe Lewenstein. Generalized substring compression. Theor. Comput. Sci., 525:45-54, 2014. URL: http://dx.doi.org/10.1016/j.tcs.2013.10.010.
  23. Jinil Kim, Peter Eades, Rudolf Fleischer, Seok-Hee Hong, Costas S. Iliopoulos, Kunsoo Park, Simon J. Puglisi, and Takeshi Tokuyama. Order-preserving matching. Theor. Comput. Sci., 525:68-79, 2014. URL: http://dx.doi.org/10.1016/j.tcs.2013.10.006.
  24. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Internal pattern matching queries in a text and applications. In Piotr Indyk, editor, 26th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, pages 532-551. SIAM, 2015. URL: http://dx.doi.org/10.1137/1.9781611973730.36.
  25. Marcin Kubica, Tomasz Kulczyński, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. A linear time algorithm for consecutive permutation pattern matching. Inf. Process. Lett., 113(12):430-433, 2013. URL: http://dx.doi.org/10.1016/j.ipl.2013.03.015.
  26. Roger Conant Lyndon. On Burnside’s problem. T. Am. Math. Soc., 77(2):202-215, 1954. URL: http://dx.doi.org/10.1090/S0002-9947-1954-0064049-X.
  27. Udi Manber and Eugene W. Myers. Suffix arrays: A new method for on-line string searches. SIAM J. Comput., 22(5):935-948, 1993. URL: http://dx.doi.org/10.1137/0222058.
  28. Marcin Mucha. Lyndon words and short superstrings. In Sanjeev Khanna, editor, 24th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2013, pages 958-972. SIAM, 2013. URL: http://dx.doi.org/10.1137/1.9781611973105.
  29. Mihai Pătraşcu and Mikkel Thorup. Dynamic integer sets with optimal rank, select, and predecessor search. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, pages 166-175. IEEE Computer Society, 2014. URL: http://dx.doi.org/10.1109/FOCS.2014.26.