On the Size of Overlapping Lempel-Ziv and Lyndon Factorizations

Authors Yuki Urabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai , Masayuki Takeda



PDF
Thumbnail PDF

File

LIPIcs.CPM.2019.29.pdf
  • Filesize: 0.63 MB
  • 11 pages

Document Identifiers

Author Details

Yuki Urabe
  • Department of Informatics, Kyushu University, Japan
Yuto Nakashima
  • Department of Informatics, Kyushu University, Japan
Shunsuke Inenaga
  • Department of Informatics, Kyushu University, Japan
Hideo Bannai
  • Department of Informatics, Kyushu University, Japan
Masayuki Takeda
  • Department of Informatics, Kyushu University, Japan

Cite As Get BibTex

Yuki Urabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. On the Size of Overlapping Lempel-Ziv and Lyndon Factorizations. In 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 128, pp. 29:1-29:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019) https://doi.org/10.4230/LIPIcs.CPM.2019.29

Abstract

Lempel-Ziv (LZ) factorization and Lyndon factorization are well-known factorizations of strings. Recently, Kärkkäinen et al. studied the relation between the sizes of the two factorizations, and showed that the size of the Lyndon factorization is always smaller than twice the size of the non-overlapping LZ factorization [STACS 2017]. In this paper, we consider a similar problem for the overlapping version of the LZ factorization. Since the size of the overlapping LZ factorization is always smaller than the size of the non-overlapping LZ factorization and, in fact, can even be an O(log n) factor smaller, it is not immediately clear whether a similar bound as in previous work would hold. Nevertheless, in this paper, we prove that the size of the Lyndon factorization is always smaller than four times the size of the overlapping LZ factorization.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Combinatorics on words
Keywords
  • Lyndon factorization
  • Lyndon words
  • Lempel-Ziv factorization

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Golnaz Badkobeh, Hideo Bannai, Keisuke Goto, Tomohiro I, Costas S. Iliopoulos, Shunsuke Inenaga, Simon J. Puglisi, and Shiho Sugimoto. Closed factorization. Discrete Applied Mathematics, 212:23-29, 2016. URL: http://dx.doi.org/10.1016/j.dam.2016.04.009.
  2. Hideo Bannai, Travis Gagie, Shunsuke Inenaga, Juha Kärkkäinen, Dominik Kempa, Marcin Piatkowski, and Shiho Sugimoto. Diverse Palindromic Factorization is NP-Complete. Int. J. Found. Comput. Sci., 29(2):143-164, 2018. URL: http://dx.doi.org/10.1142/S0129054118400014.
  3. Hideo Bannai, Tomohiro I, Shunsuke Inenaga, Yuto Nakashima, Masayuki Takeda, and Kazuya Tsuruta. The "Runs" Theorem. SIAM Journal on Computing, 46(5):1501-1514, 2017. URL: http://dx.doi.org/10.1137/15M1011032.
  4. Kirill Borozdin, Dmitry Kosolobov, Mikhail Rubinchik, and Arseny M. Shur. Palindromic Length in Linear Time. In CPM 2017, pages 23:1-23:12, 2017. URL: http://dx.doi.org/10.4230/LIPIcs.CPM.2017.23.
  5. Gang Chen, Simon J. Puglisi, and W. F. Smyth. Lempel-Ziv factorization using less time & space. Mathematics in Computer Science, 1(4):605-623, June 2008. URL: http://dx.doi.org/10.1007/s11786-007-0024-4.
  6. K. T. Chen, R. H. Fox, and R. C. Lyndon. Free Differential Calculus, IV. The Quotient Groups of the Lower Central Series. Annals of Mathematics, 68(1):81-95, 1958. URL: http://www.jstor.org/stable/1970044.
  7. Maxime Crochemore, Lucian Ilie, and Liviu Tinta. Towards a Solution to the "Runs" Conjecture. In Paolo Ferragina and Gad M. Landau, editors, Combinatorial Pattern Matching, pages 290-302, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg. Google Scholar
  8. Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Ritu Kundu, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. Near-Optimal Computation of Runs over General Alphabet via Non-Crossing LCE Queries. In SPIRE 2016, pages 22-34, 2016. URL: http://dx.doi.org/10.1007/978-3-319-46049-9_3.
  9. Marius Dumitran, Florin Manea, and Dirk Nowotka. On Prefix/Suffix-Square Free Words. In SPIRE 2015, pages 54-66, 2015. URL: http://dx.doi.org/10.1007/978-3-319-23826-5_6.
  10. Gabriele Fici, Travis Gagie, Juha Kärkkäinen, and Dominik Kempa. A subquadratic algorithm for minimum palindromic factorization. J. Discrete Algorithms, 28:41-48, 2014. URL: http://dx.doi.org/10.1016/j.jda.2014.08.001.
  11. Pawel Gawrychowski, Tomasz Kociumaka, Wojciech Rytter, and Tomasz Walen. Faster Longest Common Extension Queries in Strings over General Alphabets. In CPM 2016, pages 5:1-5:13, 2016. URL: http://dx.doi.org/10.4230/LIPIcs.CPM.2016.5.
  12. Tomohiro I, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Faster Lyndon factorization algorithms for SLP and LZ78 compressed text. Theor. Comput. Sci., 656:215-224, 2016. URL: http://dx.doi.org/10.1016/j.tcs.2016.03.005.
  13. Tomohiro I, Shiho Sugimoto, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing Palindromic Factorizations and Palindromic Covers On-line. In CPM 2014, pages 150-161, 2014. URL: http://dx.doi.org/10.1007/978-3-319-07566-2_16.
  14. Hiroe Inoue, Yoshiaki Matsuoka, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing Smallest and Largest Repetition Factorizations in O(n log n) Time. In PSC 2016, pages 135-145, 2016. URL: http://www.stringology.org/event/2016/p12.html.
  15. Juha Kärkkäinen, Dominik Kempa, Yuto Nakashima, Simon J. Puglisi, and Arseny M. Shur. On the Size of Lempel-Ziv and Lyndon Factorizations. In STACS 2017, pages 45:1-45:13, 2017. URL: http://dx.doi.org/10.4230/LIPIcs.STACS.2017.45.
  16. Roman Kolpakov and Gregory Kucherov. Finding Maximal Repetitions in a Word in Linear Time. In FOCS 1999, pages 596-604, Washington, DC, USA, 1999. IEEE Computer Society. URL: http://dl.acm.org/citation.cfm?id=795665.796470.
  17. Dmitry Kosolobov. Computing runs on a general alphabet. Inf. Process. Lett., 116(3):241-244, 2016. URL: http://dx.doi.org/10.1016/j.ipl.2015.11.016.
  18. Yoshiaki Matsuoka, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, and Florin Manea. Factorizing a String into Squares in Linear Time. In CPM 2016, pages 27:1-27:12, 2016. URL: http://dx.doi.org/10.4230/LIPIcs.CPM.2016.27.
  19. Wojciech Rytter. Application of Lempel-Ziv factorization to the approximation of grammar-based compression. Theoretical Computer Science, 302(1):211-222, 2003. URL: http://dx.doi.org/10.1016/S0304-3975(02)00777-6.
  20. Terry A. Welch. A Technique for High-Performance Data Compression. IEEE Computer, 17(6):8-19, 1984. URL: http://dx.doi.org/10.1109/MC.1984.1659158.
  21. J. Ziv and A. Lempel. A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, 23(3):337-343, May 1977. URL: http://dx.doi.org/10.1109/TIT.1977.1055714.
  22. Jacob Ziv and Abraham Lempel. Compression of individual sequences via variable-rate coding. IEEE Trans. Information Theory, 24(5):530-536, 1978. URL: http://dx.doi.org/10.1109/TIT.1978.1055934.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail