Reduction Ratio of the IS-Algorithm: Worst and Random Cases

Author Vincent Jugé



PDF
Thumbnail PDF

File

LIPIcs.CPM.2022.8.pdf
  • Filesize: 0.81 MB
  • 23 pages

Document Identifiers

Author Details

Vincent Jugé
  • LIGM, CNRS, Univ Gustave Eiffel, F77454 Marne-la-Vallée, France

Cite AsGet BibTex

Vincent Jugé. Reduction Ratio of the IS-Algorithm: Worst and Random Cases. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 8:1-8:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.CPM.2022.8

Abstract

We study the IS-algorithm, a well-known linear-time algorithm for computing the suffix array of a word. This algorithm relies on transforming the input word w into another word, called the reduced word of w, that will be at least twice shorter; then, the algorithm recursively computes the suffix array of the reduced word. In this article, we study the reduction ratio of the IS-algorithm, i.e., the ratio between the lengths of the input word and the word obtained after reducing k times the input word. We investigate both worst cases, in which we find precise results, and random cases, where we prove some strong convergence phenomena. Finally, we prove that, if the input word is a randomly chosen word of length n, we should not expect much more than log(log(n)) recursive function calls.

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • Word combinatorics
  • Suffix array
  • IS algorithm

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Mohamed Ibrahim Abouelhoda, Stefan Kurtz, and Enno Ohlebusch. Replacing suffix trees with enhanced suffix arrays. Journal of discrete algorithms, 2(1):53-86, 2004. Google Scholar
  2. Timo Bingmann, Johannes Fischer, and Vitaly Osipov. Inducing suffix and LCP arrays in external memory. Journal of Experimental Algorithmics (JEA), 21:1-27, 2016. Google Scholar
  3. Maxime Crochemore, Lucian Ilie, and William F Smyth. A simple algorithm for computing the Lempel Ziv factorization. In Data Compression Conference (DCC 2008), pages 482-488. IEEE, 2008. Google Scholar
  4. Juha Kärkkäinen, Dominik Kempa, Simon J Puglisi, and Bella Zhukova. Engineering external memory induced suffix sorting. In 2017 Proceedings of the Ninteenth Workshop on Algorithm Engineering and Experiments (ALENEX), pages 98-108. SIAM, 2017. Google Scholar
  5. Juha Kärkkäinen and Peter Sanders. Simple linear work suffix array construction. In International Colloquium on Automata, Languages, and Programming (ICALP), pages 943-955. Springer, 2003. Google Scholar
  6. Dong Kyue Kim, Jeong Seop Sim, Heejin Park, and Kunsoo Park. Linear-time construction of suffix arrays. In Annual Symposium on Combinatorial Pattern Matching (CPM), pages 186-199. Springer, 2003. Google Scholar
  7. Pang Ko and Srinivas Aluru. Space efficient linear time construction of suffix arrays. Journal of Discrete Algorithms, 3(2-4):143-156, 2005. Google Scholar
  8. David Levin and Yuval Peres. Markov chains and mixing times, volume 107. American Mathematical Society, 2017. Google Scholar
  9. Udi Manber and Gene Myers. Suffix arrays: a new method for on-line string searches. SIAM Journal on Computing, 22(5):935-948, 1993. Google Scholar
  10. Maxim Mozgovoy, Kimmo Fredriksson, Daniel White, Mike Joy, and Erkki Sutinen. Fast plagiarism detection system. In International Symposium on String Processing and Information Retrieval (SPIRE), pages 267-270. Springer, 2005. Google Scholar
  11. Cyril Nicaud. A probabilistic analysis of the reduction ratio in the suffix-array IS-algorithm. In Annual Symposium on Combinatorial Pattern Matching (CPM), pages 374-384. Springer, 2015. Google Scholar
  12. Ge Nong, Sen Zhang, and Wai Hong Chan. Two efficient algorithms for linear time suffix array construction. IEEE Transactions on Computers, 60(10):1471-1484, 2010. Google Scholar
  13. Ursula Porod. Dynamics of Markov chains for undergraduates, 2021. URL: https://www.math.northwestern.edu/documents/book-markov-chains.pdf.
  14. Simon J Puglisi, William F Smyth, and Andrew H Turpin. A taxonomy of suffix array construction algorithms. ACM Computing Surveys (CSUR), 39(2):4-es, 2007. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail