Tanimura, Yuka ;
Nishimoto, Takaaki ;
Bannai, Hideo ;
Inenaga, Shunsuke ;
Takeda, Masayuki
SmallSpace LCE Data Structure with ConstantTime Queries
Abstract
The longest common extension (LCE) problem is to preprocess a given string w of length n so that the length of the longest common prefix between suffixes of w that start at any two given positions is answered quickly. In this paper, we present a data structure of O(z \tau^2 + \frac{n}{\tau}) words of space which answers LCE queries in O(1) time and can be built in O(n \log \sigma) time, where 1 \leq \tau \leq \sqrt{n} is a parameter, z is the size of the LempelZiv 77 factorization of w and \sigma is the alphabet size. The proposed LCE data structure not access the input string w when answering queries, and thus w can be deleted after preprocessing. On top of this main result, we obtain further results using (variants of) our LCE data structure, which include the following:
 For highly repetitive strings where the z\tau^2 term is dominated by \frac{n}{\tau}, we obtain a constanttime and sublinear space LCE query data structure.
 Even when the input string is not well compressible via LempelZiv 77 factorization, we still can obtain a constanttime and sublinear space LCE data structure for suitable \tau and for \sigma \leq 2^{o(\log n)}.
 The timespace tradeoff lower bounds for the LCE problem by Bille et al. [J. Discrete Algorithms, 25:4250, 2014] and by Kosolobov [CoRR, abs/1611.02891, 2016] do not apply in some cases with our LCE data structure.
BibTeX  Entry
@InProceedings{tanimura_et_al:LIPIcs:2017:8102,
author = {Yuka Tanimura and Takaaki Nishimoto and Hideo Bannai and Shunsuke Inenaga and Masayuki Takeda},
title = {{SmallSpace LCE Data Structure with ConstantTime Queries}},
booktitle = {42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)},
pages = {10:110:15},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959770460},
ISSN = {18688969},
year = {2017},
volume = {83},
editor = {Kim G. Larsen and Hans L. Bodlaender and JeanFrancois Raskin},
publisher = {Schloss DagstuhlLeibnizZentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2017/8102},
URN = {urn:nbn:de:0030drops81021},
doi = {10.4230/LIPIcs.MFCS.2017.10},
annote = {Keywords: longest common extension, truncated suffix trees, tcovers}
}
01.12.2017
Keywords: 

longest common extension, truncated suffix trees, tcovers 
Seminar: 

42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)

Issue date: 

2017 
Date of publication: 

01.12.2017 