Document

**Published in:** LIPIcs, Volume 83, 42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)

The longest common extension (LCE) problem is to preprocess a given string $w$ of length $n$ so that the length of the longest common prefix between suffixes of $w$ that start at any two given positions can be answered quickly. In this paper, we present a data structure of $O(z \tau^2 + \frac{n}{\tau})$ words of space which answers LCE queries in $O(1)$ time and can be built in $O(n \log \sigma)$ time, where $1 \leq \tau \leq \sqrt{n}$ is a parameter, $z$ is the size of the Lempel-Ziv 77 factorization of $w$, and $\sigma$ is the alphabet size. The proposed LCE data structure does not access the input string $w$ when answering queries, and thus $w$ can be deleted after preprocessing. On top of this main result, we obtain further results using (variants of) our LCE data structure, which include the following:
- For highly repetitive strings where the $z\tau^2$ term is dominated by $\frac{n}{\tau}$, we obtain a constant-time and sub-linear space LCE query data structure.
- Even when the input string is not well compressible via Lempel-Ziv 77 factorization, we still can obtain a constant-time and sub-linear space LCE data structure for suitable $\tau$ and for $\sigma \leq 2^{o(\log n)}$.
- The time-space trade-off lower bounds for the LCE problem by Bille et al. [J. Discrete Algorithms, 25:42-50, 2014] and by Kosolobov [CoRR, abs/1611.02891, 2016] do not apply in some cases with our LCE data structure.
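As a worked illustration of the first item above, the following shows when the stated space bound becomes sub-linear. It is plain arithmetic on the bound from the abstract. The specific choice of z and tau in the second line is an assumed example, not a case taken from the paper.

```latex
% The z*tau^2 term is dominated by n/tau exactly when tau^3 <= n/z:
\[
  z\tau^{2} \le \frac{n}{\tau}
  \iff \tau \le \left(\frac{n}{z}\right)^{1/3}
  \quad\Longrightarrow\quad
  O\!\left(z\tau^{2} + \frac{n}{\tau}\right) = O\!\left(\frac{n}{\tau}\right).
\]
% Assumed example: z = O(n^{1/4}) and tau = n^{1/4} (which satisfies 1 <= tau <= sqrt(n)):
\[
  z\tau^{2} + \frac{n}{\tau}
  = O\!\left(n^{1/4}\cdot n^{1/2}\right) + n^{3/4}
  = O\!\left(n^{3/4}\right) = o(n).
\]
```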

Yuka Tanimura, Takaaki Nishimoto, Hideo Bannai, Shunsuke Inenaga, and Masayuki Takeda. Small-Space LCE Data Structure with Constant-Time Queries. In 42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 83, pp. 10:1-10:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


@InProceedings{tanimura_et_al:LIPIcs.MFCS.2017.10,
  author    = {Tanimura, Yuka and Nishimoto, Takaaki and Bannai, Hideo and Inenaga, Shunsuke and Takeda, Masayuki},
  title     = {{Small-Space LCE Data Structure with Constant-Time Queries}},
  booktitle = {42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)},
  pages     = {10:1--10:15},
  series    = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN      = {978-3-95977-046-0},
  ISSN      = {1868-8969},
  year      = {2017},
  volume    = {83},
  editor    = {Larsen, Kim G. and Bodlaender, Hans L. and Raskin, Jean-Francois},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address   = {Dagstuhl, Germany},
  URL       = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2017.10},
  URN       = {urn:nbn:de:0030-drops-81021},
  doi       = {10.4230/LIPIcs.MFCS.2017.10},
  annote    = {Keywords: longest common extension, truncated suffix trees, t-covers}
}

Document

**Published in:** LIPIcs, Volume 54, 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)

Given a string S of n symbols, a longest common extension query LCE(i,j) asks for the length of the longest common prefix of the $i$th and $j$th suffixes of S. LCE queries have several important applications in string processing, perhaps most notably to suffix sorting. Recently, Bille et al. (J. Discrete Algorithms 25:42-50, 2014, Proc. CPM 2015:65-76) described several data structures for answering LCE queries that offer a space-time trade-off between data structure size and query time. In particular, for a parameter 1 <= tau <= n, their best deterministic solution is a data structure of size O(n/tau) which allows LCE queries to be answered in O(tau) time. However, the construction time for all deterministic versions of their data structure is quadratic in n. In this paper, we propose a deterministic solution that achieves a similar space-time trade-off of O(tau * min(log(tau), log(n/tau))) query time using O(n/tau) space, but significantly improves the construction time to O(n*tau).
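To make the query definition concrete, here is a minimal sketch of a naive LCE query by direct character comparison. It is not the data structure proposed in the paper: it uses no preprocessing, needs access to the whole string, and takes time proportional to the answer. The function name `lce` and the 0-based positions are assumptions made for illustration.

```python
def lce(s: str, i: int, j: int) -> int:
    """Return the length of the longest common prefix of s[i:] and s[j:].

    Naive baseline: compares characters one by one, with no preprocessing.
    Positions i and j are 0-based (an assumption for this sketch).
    """
    n = len(s)
    k = 0
    # Extend the match while both positions stay inside the string
    # and the characters at the current offset agree.
    while i + k < n and j + k < n and s[i + k] == s[j + k]:
        k += 1
    return k


if __name__ == "__main__":
    text = "abracadabra"
    print(lce(text, 0, 7))  # suffixes "abracadabra" and "abra"
    print(lce(text, 0, 3))  # suffixes "abracadabra" and "acadabra"
```

The trade-off data structures described above aim to answer the same query faster than this scan while storing far less than the full input.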

Yuka Tanimura, Tomohiro I, Hideo Bannai, Shunsuke Inenaga, Simon J. Puglisi, and Masayuki Takeda. Deterministic Sub-Linear Space LCE Data Structures With Efficient Construction. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 1:1-1:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


@InProceedings{tanimura_et_al:LIPIcs.CPM.2016.1,
  author    = {Tanimura, Yuka and I, Tomohiro and Bannai, Hideo and Inenaga, Shunsuke and Puglisi, Simon J. and Takeda, Masayuki},
  title     = {{Deterministic Sub-Linear Space LCE Data Structures With Efficient Construction}},
  booktitle = {27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)},
  pages     = {1:1--1:10},
  series    = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN      = {978-3-95977-012-5},
  ISSN      = {1868-8969},
  year      = {2016},
  volume    = {54},
  editor    = {Grossi, Roberto and Lewenstein, Moshe},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address   = {Dagstuhl, Germany},
  URL       = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2016.1},
  URN       = {urn:nbn:de:0030-drops-60655},
  doi       = {10.4230/LIPIcs.CPM.2016.1},
  annote    = {Keywords: longest common extension, longest common prefix, sparse suffix array}
}
