License
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.MFCS.2017.10
URN: urn:nbn:de:0030-drops-81021
URL: http://drops.dagstuhl.de/opus/volltexte/2017/8102/
Go to the corresponding LIPIcs Volume Portal


Tanimura, Yuka ; Nishimoto, Takaaki ; Bannai, Hideo ; Inenaga, Shunsuke ; Takeda, Masayuki

Small-Space LCE Data Structure with Constant-Time Queries

pdf-format:
LIPIcs-MFCS-2017-10.pdf (0.7 MB)


Abstract

The longest common extension (LCE) problem is to preprocess a given string w of length n so that the length of the longest common prefix between suffixes of w that start at any two given positions is answered quickly. In this paper, we present a data structure of O(z \tau^2 + \frac{n}{\tau}) words of space which answers LCE queries in O(1) time and can be built in O(n \log \sigma) time, where 1 \leq \tau \leq \sqrt{n} is a parameter, z is the size of the Lempel-Ziv 77 factorization of w and \sigma is the alphabet size. The proposed LCE data structure not access the input string w when answering queries, and thus w can be deleted after preprocessing. On top of this main result, we obtain further results using (variants of) our LCE data structure, which include the following: - For highly repetitive strings where the z\tau^2 term is dominated by \frac{n}{\tau}, we obtain a constant-time and sub-linear space LCE query data structure. - Even when the input string is not well compressible via Lempel-Ziv 77 factorization, we still can obtain a constant-time and sub-linear space LCE data structure for suitable \tau and for \sigma \leq 2^{o(\log n)}. - The time-space trade-off lower bounds for the LCE problem by Bille et al. [J. Discrete Algorithms, 25:42-50, 2014] and by Kosolobov [CoRR, abs/1611.02891, 2016] do not apply in some cases with our LCE data structure.

BibTeX - Entry

@InProceedings{tanimura_et_al:LIPIcs:2017:8102,
  author =	{Yuka Tanimura and Takaaki Nishimoto and Hideo Bannai and Shunsuke Inenaga and Masayuki Takeda},
  title =	{{Small-Space LCE Data Structure with Constant-Time Queries}},
  booktitle =	{42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)},
  pages =	{10:1--10:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-046-0},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{83},
  editor =	{Kim G. Larsen and Hans L. Bodlaender and Jean-Francois Raskin},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2017/8102},
  URN =		{urn:nbn:de:0030-drops-81021},
  doi =		{10.4230/LIPIcs.MFCS.2017.10},
  annote =	{Keywords: longest common extension, truncated suffix trees, t-covers}
}

Keywords: longest common extension, truncated suffix trees, t-covers
Seminar: 42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)
Issue Date: 2017
Date of publication: 22.11.2017


DROPS-Home | Fulltext Search | Imprint Published by LZI