eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2023-08-30
14:1
14:16
10.4230/LIPIcs.ESA.2023.14
article
Lyndon Arrays in Sublinear Time
Bannai, Hideo
1
https://orcid.org/0000-0002-6856-5185
Ellert, Jonas
2
https://orcid.org/0000-0003-3305-6185
M&D Data Science Center, Tokyo Medical and Dental University, Japan
Technical University of Dortmund, Germany
A Lyndon word is a string that is lexicographically smaller than all of its non-trivial suffixes. For example, "airbus" is a Lyndon word, but "amtrak" is not a Lyndon word due to its suffix "ak". The Lyndon array stores the length of the longest Lyndon prefix of each suffix of a string. For a length-n string over a general ordered alphabet, the array can be computed in O(n) time (Bille et al., ICALP 2020). However, on a word-RAM of word-width w ≥ log₂ n, linear time is not optimal if the string is over an integer alphabet {0, …, σ} with σ ≪ n. In this case, the string can be stored in O(n log σ) bits (or O(n / log_σ n) words) of memory, and reading it takes only O(n / log_σ n) time. We show that O(n / log_σ n) time and words of space suffice to compute the succinct 2n-bit version of the Lyndon array. The time is optimal for w = O(log n). The algorithm uses precomputed lookup tables to perform significant parts of the computation in constant time. This is possible due to properties of periodic substrings, which we carefully analyze to achieve the desired result. We envision that the algorithm has applications in the computation of runs (maximal periodic substrings), where the Lyndon array plays a central role in both theoretically and practically fast algorithms.
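The definitions in the abstract can be illustrated with a small sketch. It is not the sublinear algorithm of the paper, and not the linear-time algorithm of Bille et al.; it is a simple quadratic-time baseline that finds the longest Lyndon prefix of each suffix with the first step of Duval's factorization. The function names `is_lyndon`, `longest_lyndon_prefix` and `lyndon_array` are hypothetical and chosen only for this illustration.

```python
def is_lyndon(s: str) -> bool:
    """Check the definition directly: s is smaller than every proper non-empty suffix."""
    return len(s) > 0 and all(s < s[i:] for i in range(1, len(s)))


def longest_lyndon_prefix(s: str, i: int) -> int:
    """Length of the longest Lyndon prefix of the suffix s[i:].

    This is the first factor that Duval's algorithm would output
    when run on s[i:]. Assumes 0 <= i < len(s).
    """
    n = len(s)
    j, k = i + 1, i
    while j < n and s[k] <= s[j]:
        if s[k] < s[j]:
            k = i      # strictly larger symbol: the whole prefix s[i:j+1] is Lyndon
        else:
            k += 1     # equal symbol: keep extending the current periodic run
        j += 1
    return j - k       # j - k is the period, i.e. the length of the first factor


def lyndon_array(s: str) -> list[int]:
    """Lyndon array of s, computed naively in O(n^2) worst-case time."""
    return [longest_lyndon_prefix(s, i) for i in range(len(s))]


if __name__ == "__main__":
    print(is_lyndon("airbus"), is_lyndon("amtrak"))
    print(lyndon_array("amtrak"))
```

For "amtrak" the first entry is 4, since "amtr" is the longest Lyndon prefix: extending it to "amtra" introduces the suffix "a", which is smaller than the whole word.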
https://drops.dagstuhl.de/storage/00lipics/lipics-vol274-esa2023/LIPIcs.ESA.2023.14/LIPIcs.ESA.2023.14.pdf
Lyndon forest
Lyndon table
Lyndon array
sublinear time algorithms
word RAM algorithms
word packing
tabulation
lookup tables
periodicity