Computing Runs on a Trie

Authors Ryo Sugahara, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai , Masayuki Takeda



PDF
Thumbnail PDF

File

LIPIcs.CPM.2019.23.pdf
  • Filesize: 0.52 MB
  • 11 pages

Document Identifiers

Author Details

Ryo Sugahara
  • Department of Informatics, Kyushu University, Japan
Yuto Nakashima
  • Department of Informatics, Kyushu University, Japan
Shunsuke Inenaga
  • Department of Informatics, Kyushu University, Japan
Hideo Bannai
  • Department of Informatics, Kyushu University, Japan
Masayuki Takeda
  • Department of Informatics, Kyushu University, Japan

Cite AsGet BibTex

Ryo Sugahara, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing Runs on a Trie. In 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 128, pp. 23:1-23:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.CPM.2019.23

Abstract

A maximal repetition, or run, in a string, is a maximal periodic substring whose smallest period is at most half the length of the substring. In this paper, we consider runs that correspond to a path on a trie, or in other words, on a rooted edge-labeled tree where the endpoints of the path must be a descendant/ancestor of the other. For a trie with n edges, we show that the number of runs is less than n. We also show an O(n sqrt{log n}log log n) time and O(n) space algorithm for counting and finding the shallower endpoint of all runs. We further show an O(n log n) time and O(n) space algorithm for finding both endpoints of all runs. We also discuss how to improve the running time even more.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Combinatorial algorithms
  • Mathematics of computing → Combinatorics on words
Keywords
  • runs
  • Lyndon words

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Hideo Bannai, Tomohiro I, Shunsuke Inenaga, Yuto Nakashima, Masayuki Takeda, and Kazuya Tsuruta. The "Runs" Theorem. SIAM J. Comput., 46(5):1501-1514, 2017. URL: http://dx.doi.org/10.1137/15M1011032.
  2. Djamal Belazzougui and Simon J. Puglisi. Range Predecessor and Lempel-Ziv Parsing. In Robert Krauthgamer, editor, Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 2053-2071. SIAM, 2016. URL: http://dx.doi.org/10.1137/1.9781611974331.ch143.
  3. Michael A. Bender and Martin Farach-Colton. The LCA Problem Revisited. In Gaston H. Gonnet, Daniel Panario, and Alfredo Viola, editors, LATIN 2000: Theoretical Informatics, 4th Latin American Symposium, Punta del Este, Uruguay, April 10-14, 2000, Proceedings, volume 1776 of Lecture Notes in Computer Science, pages 88-94. Springer, 2000. URL: http://dx.doi.org/10.1007/10719839_9.
  4. Michael A. Bender and Martin Farach-Colton. The Level Ancestor Problem simplified. Theor. Comput. Sci., 321(1):5-12, 2004. URL: http://dx.doi.org/10.1016/j.tcs.2003.05.002.
  5. Philip Bille, Pawel Gawrychowski, Inge Li Gørtz, Gad M. Landau, and Oren Weimann. Longest common extensions in trees. Theor. Comput. Sci., 638:98-107, 2016. URL: http://dx.doi.org/10.1016/j.tcs.2015.08.009.
  6. Dany Breslauer. Efficient String Algorithmics. PhD thesis, Columbia University, 1992. Google Scholar
  7. Dany Breslauer. The Suffix Tree of a Tree and Minimizing Sequential Transducers. Theor. Comput. Sci., 191(1-2):131-144, 1998. URL: http://dx.doi.org/10.1016/S0304-3975(96)00319-2.
  8. Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Wojciech Tyczynski, and Tomasz Walen. The Maximum Number of Squares in a Tree. In Juha Kärkkäinen and Jens Stoye, editors, Combinatorial Pattern Matching - 23rd Annual Symposium, CPM 2012, Helsinki, Finland, July 3-5, 2012. Proceedings, volume 7354 of Lecture Notes in Computer Science, pages 27-40. Springer, 2012. URL: http://dx.doi.org/10.1007/978-3-642-31265-6_3.
  9. Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Ritu Kundu, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. Near-Optimal Computation of Runs over General Alphabet via Non-Crossing LCE Queries. In Shunsuke Inenaga, Kunihiko Sadakane, and Tetsuya Sakai, editors, String Processing and Information Retrieval - 23rd International Symposium, SPIRE 2016, Beppu, Japan, October 18-20, 2016, Proceedings, volume 9954 of Lecture Notes in Computer Science, pages 22-34, 2016. URL: http://dx.doi.org/10.1007/978-3-319-46049-9_3.
  10. Maxime Crochemore, Costas S. Iliopoulos, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. The maximal number of cubic runs in a word. J. Comput. Syst. Sci., 78(6):1828-1836, 2012. URL: http://dx.doi.org/10.1016/j.jcss.2011.12.005.
  11. Maxime Crochemore, Costas S. Iliopoulos, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. Extracting powers and periods in a word from its runs structure. Theor. Comput. Sci., 521:29-41, 2014. URL: http://dx.doi.org/10.1016/j.tcs.2013.11.018.
  12. Maxime Crochemore and Wojciech Rytter. Sqares, Cubes, and Time-Space Efficient String Searching. Algorithmica, 13(5):405-425, 1995. URL: http://dx.doi.org/10.1007/BF01190846.
  13. Johannes Fischer and Volker Heun. Theoretical and Practical Improvements on the RMQ-Problem, with Applications to LCA and LCE. In Moshe Lewenstein and Gabriel Valiente, editors, Combinatorial Pattern Matching, 17th Annual Symposium, CPM 2006, Barcelona, Spain, July 5-7, 2006, Proceedings, volume 4009 of Lecture Notes in Computer Science, pages 36-48. Springer, 2006. URL: http://dx.doi.org/10.1007/11780441_5.
  14. Harold N Gabow and Robert Endre Tarjan. A linear-time algorithm for a special case of disjoint set union. Journal of Computer and System Sciences, 30(2):209-221, 1985. Google Scholar
  15. Pawel Gawrychowski, Tomasz Kociumaka, Wojciech Rytter, and Tomasz Walen. Faster Longest Common Extension Queries in Strings over General Alphabets. In Roberto Grossi and Moshe Lewenstein, editors, 27th Annual Symposium on Combinatorial Pattern Matching, CPM 2016, June 27-29, 2016, Tel Aviv, Israel, volume 54 of LIPIcs, pages 5:1-5:13. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016. URL: http://dx.doi.org/10.4230/LIPIcs.CPM.2016.5.
  16. Tomasz Kociumaka, Jakub Pachocki, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. Efficient counting of square substrings in a tree. Theoretical Computer Science, 544:60-73, 2014. Google Scholar
  17. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. String Powers in Trees. Algorithmica, 79(3):814-834, 2017. URL: http://dx.doi.org/10.1007/s00453-016-0271-3.
  18. Roman M. Kolpakov and Gregory Kucherov. Finding Maximal Repetitions in a Word in Linear Time. In 40th Annual Symposium on Foundations of Computer Science, FOCS '99, 17-18 October, 1999, New York, NY, USA, pages 596-604. IEEE Computer Society, 1999. URL: http://dx.doi.org/10.1109/SFFCS.1999.814634.
  19. Dmitry Kosolobov. Computing runs on a general alphabet. Inf. Process. Lett., 116(3):241-244, 2016. URL: http://dx.doi.org/10.1016/j.ipl.2015.11.016.
  20. M. Lothaire. Periodic Structures in Words, page 430–477. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2005. URL: http://dx.doi.org/10.1017/CBO9781107341005.009.
  21. Milan Ruzic. Constructing Efficient Dictionaries in Close to Sorting Time. In Luca Aceto, Ivan Damgård, Leslie Ann Goldberg, Magnús M. Halldórsson, Anna Ingólfsdóttir, and Igor Walukiewicz, editors, Automata, Languages and Programming, 35th International Colloquium, ICALP 2008, Reykjavik, Iceland, July 7-11, 2008, Proceedings, Part I: Tack A: Algorithms, Automata, Complexity, and Games, volume 5125 of Lecture Notes in Computer Science, pages 84-95. Springer, 2008. URL: http://dx.doi.org/10.1007/978-3-540-70575-8_8.
  22. Tetsuo Shibuya. Constructing the Suffix Tree of a Tree with a Large Alphabet. In Alok Aggarwal and C. Pandu Rangan, editors, Algorithms and Computation, 10th International Symposium, ISAAC '99, Chennai, India, December 16-18, 1999, Proceedings, volume 1741 of Lecture Notes in Computer Science, pages 225-236. Springer, 1999. URL: http://dx.doi.org/10.1007/3-540-46632-0_24.
  23. Ryo Sugahara, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing runs on a trie. CoRR, abs/1901.10633, 2019. URL: http://arxiv.org/abs/1901.10633.
  24. Wikipedia. All nearest smaller values - Wikipedia, the free encyclopedia. http://en.wikipedia.org/w/index.php?title=All%20nearest%20smaller%20values&oldid=882430084, 2019. [Online; accessed 29-March-2019].
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail