eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2024-06-18
20:1
20:18
10.4230/LIPIcs.CPM.2024.20
article
Construction of Sparse Suffix Trees and LCE Indexes in Optimal Time and Space
Kosolobov, Dmitry
1
https://orcid.org/0000-0002-2909-2952
Sivukhin, Nikita
1
https://orcid.org/0000-0003-4995-6954
Ural Federal University, Ekaterinburg, Russia
The notions of synchronizing and partitioning sets are recently introduced variants of locally consistent parsings with great potential in problem-solving. In this paper we propose a deterministic algorithm that constructs, for a given read-only string of length n over the alphabet {0,1,…,n^{𝒪(1)}}, a variant of a τ-partitioning set with size 𝒪(b) and τ = n/b using 𝒪(b) space and 𝒪(n/ε) time, provided b ≥ n^ε for ε > 0. As a corollary, for b ≥ n^ε and constant ε > 0, we obtain linear-time construction algorithms with 𝒪(b) space on top of the string for two major small-space indexes: a sparse suffix tree, which is a compacted trie built on b chosen suffixes of the string, and a longest common extension (LCE) index, which occupies 𝒪(b) space and allows us to compute the longest common prefix for any pair of substrings in 𝒪(n/b) time. For both, the 𝒪(b) construction storage is asymptotically optimal, since the tree itself takes 𝒪(b) space and any LCE index with 𝒪(n/b) query time must occupy Ω(b) space by a known trade-off (at least for b ≥ Ω(n / log n)). For arbitrary b ≥ Ω(log² n), we present construction algorithms for the partitioning set, sparse suffix tree, and LCE index with 𝒪(n log_b n) running time and 𝒪(b) space, thus also improving the state of the art.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol296-cpm2024/LIPIcs.CPM.2024.20/LIPIcs.CPM.2024.20.pdf
(τ,δ)-partitioning set
longest common extension
sparse suffix tree