On Maximal Repeats in Compressed Strings

Author Julian Pape-Lange



PDF
Thumbnail PDF

File

LIPIcs.CPM.2019.18.pdf
  • Filesize: 498 kB
  • 13 pages

Document Identifiers

Author Details

Julian Pape-Lange
  • Technische Universität Chemnitz, Straße der Nationen 62, 09111 Chemnitz, Germany

Acknowledgements

I thank Professor Laurent Bartholdi for leading me to this research topic as well as for his helpful and inspiring pieces of advice.

Cite AsGet BibTex

Julian Pape-Lange. On Maximal Repeats in Compressed Strings. In 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 128, pp. 18:1-18:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.CPM.2019.18

Abstract

This paper presents and proves a new non-trivial upper bound on the number of maximal repeats of compressed strings. Using Theorem 1 of Raffinot’s article "On Maximal Repeats in Strings", this upper bound can be directly translated into an upper bound on the number of nodes in the Compacted Directed Acyclic Word Graphs of compressed strings. More formally, this paper proves that the number of maximal repeats in a string with z (self-referential) LZ77-factors and without q-th powers is at most 3q(z+1)^3-2. Also, this paper proves that for 2000 <= z <= q this upper bound is tight up to a constant factor.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Combinatorics on words
  • Mathematics of computing → Combinatoric problems
Keywords
  • Maximal repeats
  • Combinatorics on compressed strings
  • LZ77
  • Compact suffix automata
  • CDAWGs

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Djamal Belazzougui, Fabio Cunial, Travis Gagie, Nicola Prezza, and Mathieu Raffinot. Composite Repetition-Aware Data Structures. In Ferdinando Cicalese, Ely Porat, and Ugo Vaccaro, editors, Combinatorial Pattern Matching - 26th Annual Symposium, CPM 2015, Ischia Island, Italy, June 29 - July 1, 2015, Proceedings, volume 9133 of Lecture Notes in Computer Science, pages 26-39. Springer, 2015. URL: http://dx.doi.org/10.1007/978-3-319-19929-0_3.
  2. Anselm Blumer, J. Blumer, David Haussler, Ross M. McConnell, and Andrzej Ehrenfeucht. Complete inverted files for efficient text retrieval and analysis. J. ACM, 34(3):578-595, 1987. URL: http://dx.doi.org/10.1145/28869.28873.
  3. Anselm Blumer, Andrzej Ehrenfeucht, and David Haussler. Average sizes of suffix trees and DAWGs. Discrete Applied Mathematics, 24(1-3):37-45, 1989. URL: http://dx.doi.org/10.1016/0166-218X(92)90270-K.
  4. Moses Charikar, Eric Lehman, Ding Liu, Rina Panigrahy, Manoj Prabhakaran, Amit Sahai, and Abhi Shelat. The smallest grammar problem. IEEE Trans. Information Theory, 51(7):2554-2576, 2005. URL: http://dx.doi.org/10.1109/TIT.2005.850116.
  5. Maxime Crochemore and Wojciech Rytter. Text Algorithms. Oxford University Press, 1994. URL: http://www-igm.univ-mlv.fr/%7Emac/REC/B1.html.
  6. Chiara Epifanio, Filippo Mignosi, Jeffrey Shallit, and Ilaria Venturini. On Sturmian graphs. Discrete Applied Mathematics, 155(8):1014-1030, 2007. URL: http://dx.doi.org/10.1016/j.dam.2006.11.003.
  7. N. J. Fine and H. S. Wilf. Uniqueness Theorems for Periodic Functions. Proceedings of the American Mathematical Society, 16(1):109-114, 1965. URL: http://www.jstor.org/stable/2034009.
  8. Isamu Furuya, Takuya Takagi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Takuya Kida. MR-RePair: Grammar Compression based on Maximal Repeats. CoRR, abs/1811.04596, 2018. URL: http://arxiv.org/abs/1811.04596.
  9. Dan Gusfield. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology. Cambridge University Press, 1997. Google Scholar
  10. Jacques Nicolas, Christine Rousseau, Anne Siegel, Pierre Peterlongo, François Coste, Patrick Durand, Sebastien Tempel, Anne-Sophie Valin, and Frédéric Mahé. Local and Maximal Repeats. URL: https://www.researchgate.net/publication/228940275_Local_and_Maximal_Repeats.
  11. Jakub Radoszewski and Wojciech Rytter. On the structure of compacted subword graphs of Thue-Morse words and their applications. J. Discrete Algorithms, 11:15-24, 2012. URL: http://dx.doi.org/10.1016/j.jda.2011.01.001.
  12. Mathieu Raffinot. On maximal repeats in strings. Inf. Process. Lett., 80(3):165-169, 2001. URL: http://dx.doi.org/10.1016/S0020-0190(01)00152-1.
  13. Yuka Tanimura, Takaaki Nishimoto, Hideo Bannai, Shunsuke Inenaga, and Masayuki Takeda. Small-Space LCE Data Structure with Constant-Time Queries. In Kim G. Larsen, Hans L. Bodlaender, and Jean-François Raskin, editors, 42nd International Symposium on Mathematical Foundations of Computer Science, MFCS 2017, August 21-25, 2017 - Aalborg, Denmark, volume 83 of LIPIcs, pages 10:1-10:15. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2017. URL: http://dx.doi.org/10.4230/LIPIcs.MFCS.2017.10.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail