LZ-End Parsing in Linear Time

Authors Dominik Kempa, Dmitry Kosolobov



PDF
Thumbnail PDF

File

LIPIcs.ESA.2017.53.pdf
  • Filesize: 0.5 MB
  • 14 pages

Document Identifiers

Author Details

Dominik Kempa
Dmitry Kosolobov

Cite AsGet BibTex

Dominik Kempa and Dmitry Kosolobov. LZ-End Parsing in Linear Time. In 25th Annual European Symposium on Algorithms (ESA 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 87, pp. 53:1-53:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)
https://doi.org/10.4230/LIPIcs.ESA.2017.53

Abstract

We present a deterministic algorithm that constructs in linear time and space the LZ-End parsing (a variation of LZ77) of a given string over an integer polynomially bounded alphabet.
Keywords
  • LZ-End
  • LZ77
  • construction algorithm
  • linear time

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. P. Beame and F. E. Fich. Optimal bounds for the predecessor problem. In STOC 1999, pages 295-304. ACM, 1999. URL: http://dx.doi.org/10.1006/jcss.2002.1822.
  2. D. Belazzougui and S. J. Puglisi. Range predecessor and Lempel-Ziv parsing. In SODA 2016, pages 2053-2071. SIAM, 2016. URL: http://dx.doi.org/10.1137/1.9781611974331.ch143.
  3. G. E. Blelloch. Space-efficient dynamic orthogonal point location, segment intersection, and range reporting. In SODA 2008, pages 894-903. SIAM, 2008. Google Scholar
  4. F. Claude, A. Fariña, M. A. Martínez-Prieto, and G. Navarro. Universal indexes for highly repetitive document collections. Information Systems, 61:1-23, 2016. URL: http://dx.doi.org/10.1016/j.is.2016.04.002.
  5. T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms, third edition. MIT press, 2009. Google Scholar
  6. M. Crochemore and W. Rytter. Jewels of stringology. World Scientific Publishing Co. Pte. Ltd., 2002. Google Scholar
  7. H. Ferrada, T. Gagie, T. Hirvola, and S. J. Puglisi. Hybrid indexes for repetitive datasets. Phil. Trans. R. Soc. A, 372, 2014. URL: http://dx.doi.org/10.1098/rsta.2013.0137.
  8. P. Ferragina and G. Manzini. On compressing the textual web. In WSDM 2010, pages 391-400. ACM, 2010. URL: http://dx.doi.org/10.1145/1718487.1718536.
  9. J. Fischer, T. Gagie, P. Gawrychowski, and T. Kociumaka. Approximating LZ77 via small-space multiple-pattern matching. In ESA 2015, volume 9294 of LNCS, pages 533-544. Springer, 2015. URL: http://dx.doi.org/10.1007/978-3-662-48350-3_45.
  10. J. Fischer and V. Heun. Theoretical and practical improvements on the RMQ-problem, with applications to LCA and LCE. In CPM 2006, volume 4009 of LNCS, pages 36-48. Springer, 2006. URL: http://dx.doi.org/10.1007/11780441_5.
  11. M. L. Fredman and D. E. Willard. Surpassing the information theoretic bound with fusion trees. Journal of Computer and System Sciences, 47(3):424-436, 1993. URL: http://dx.doi.org/10.1016/0022-0000(93)90040-4.
  12. H. N. Gabow and R. E. Tarjan. A linear-time algorithm for a special case of disjoint set union. In STOC 1983, pages 246-251. ACM, 1983. URL: http://dx.doi.org/10.1145/800061.808753.
  13. T. Gagie, P. Gawrychowski, J. Kärkkäinen, Y. Nekrich, and S. J. Puglisi. A faster grammar-based self-index. In LATA 2012, volume 7183 of LNCS, pages 240-251. Springer, 2012. URL: http://dx.doi.org/10.1007/978-3-642-28332-1_21.
  14. T. Gagie, P. Gawrychowski, J. Kärkkäinen, Y. Nekrich, and S. J. Puglisi. LZ77-based self-indexing with faster pattern matching. In LATIN 2014, volume 8392 of LNCS, pages 731-742. Springer, 2014. URL: http://dx.doi.org/10.1007/978-3-642-54423-1_63.
  15. T. Gagie, P Gawrychowski, and S. J. Puglisi. Faster approximate pattern matching in compressed repetitive texts. In ISAAC 2011, volume 7074 of LNCS, pages 653-662. Springer, 2011. URL: http://dx.doi.org/10.1007/978-3-642-25591-5_67.
  16. P. Gawrychowski, M. Lewenstein, and P. K. Nicholson. Weighted ancestors in suffix trees. In ESA 2014, volume 8737 of LNCS, pages 455-466. Springer, 2014. URL: http://dx.doi.org/10.1007/978-3-662-44777-2_38.
  17. J. Kärkkäinen, D. Kempa, and S. J. Puglisi. Linear time Lempel-Ziv factorization: Simple, fast, small. In CPM 2013, volume 7922 of LNCS, pages 189-200. Springer, 2013. URL: http://dx.doi.org/10.1007/978-3-642-38905-4_19.
  18. J. Kärkkäinen, D. Kempa, and S. J Puglisi. Lempel-Ziv parsing in external memory. In DCC 2014, pages 153-162. IEEE, 2014. URL: http://dx.doi.org/10.1109/DCC.2014.78.
  19. D. Kempa and D. Kosolobov. LZ-End parsing in compressed space. In DCC 2017, pages 350-359. IEEE, 2017. URL: http://dx.doi.org/10.1109/DCC.2017.73.
  20. T. Kopelowitz and M. Lewenstein. Dynamic weighted ancestors. In SODA 2007, pages 565-574. SIAM, 2007. Google Scholar
  21. D. Kosolobov. Faster lightweight Lempel-Ziv parsing. In MFCS 2015, volume 9235 of LNCS, pages 432-444, 2015. URL: http://dx.doi.org/10.1007/978-3-662-48054-0_36.
  22. S. Kreft and G. Navarro. LZ77-like compression with fast random access. In DCC 2010, pages 239-248. IEEE, 2010. URL: http://dx.doi.org/10.1109/DCC.2010.29.
  23. S. Kreft and G. Navarro. Self-indexing based on LZ77. In CPM 2011, volume 6661 of LNCS, pages 41-54. Springer, 2011. URL: http://dx.doi.org/10.1007/978-3-642-21458-5_6.
  24. S. Kreft and G. Navarro. On compressing and indexing repetitive sequences. Theoretical Computer Science, 483:115-133, 2013. URL: http://dx.doi.org/10.1016/j.tcs.2012.02.006.
  25. V. Mäkinen and G. Navarro. Compressed full-text indexes. ACM Computing Surveys (CSUR), 39(1):2, 2007. URL: http://dx.doi.org/10.1145/1216370.1216372.
  26. J. I. Munro, G. Navarro, and Y. Nekrich. Space-efficient construction of compressed indexes in deterministic linear time. In SODA 2017, pages 408-424. SIAM, 2017. URL: http://dx.doi.org/10.1137/1.9781611974782.26.
  27. G. Navarro. Indexing text using the Ziv-Lempel trie. J. Discrete Algorithms, 2(1):87-114, 2004. URL: http://dx.doi.org/10.1016/S1570-8667(03)00066-2.
  28. M. Pătraşcu and M. Thorup. Time-space trade-offs for predecessor search. In STOC 2006, pages 232-240. ACM, 2006. URL: http://dx.doi.org/10.1145/1132516.1132551.
  29. M. Pătraşcu and M. Thorup. Dynamic integer sets with optimal rank, select, and predecessor search. In FOCS 2014, pages 166-175. IEEE, 2014. URL: http://dx.doi.org/10.1109/FOCS.2014.26.
  30. A. Policriti and N. Prezza. Computing LZ77 in run-compressed space. In DCC 2016, pages 23-32. IEEE, 2016. URL: http://dx.doi.org/10.1109/DCC.2016.30.
  31. J. Sirén, N. Välimäki, V. Mäkinen, and G. Navarro. Run-length compressed indexes are superior for highly repetitive sequence collections. In SPIRE 2008, volume 5280 of LNCS, pages 164-175. Springer, 2008. URL: http://dx.doi.org/10.1007/978-3-540-89097-3_17.
  32. D. D. Sleator and R. E. Tarjan. A data structure for dynamic trees. Journal of Computer and System Sciences, 26(3):362-391, 1983. URL: http://dx.doi.org/10.1016/0022-0000(83)90006-5.
  33. P. van Emde Boas, R. Kaas, and E. Zijlstra. Design and implementation of an efficient priority queue. Mathematical systems theory, 10(1):99-127, 1976. URL: http://dx.doi.org/10.1007/BF01683268.
  34. J. Ziv and A. Lempel. A universal algorithm for sequential data compression. IEEE Trans. Information Theory, 23(3):337-343, 1977. URL: http://dx.doi.org/10.1109/TIT.1977.1055714.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail