Bannai, Hideo ;
Gagie, Travis ;
I, Tomohiro
Online LZ77 Parsing and Matching Statistics with RLBWTs
Abstract
LempelZiv 1977 (LZ77) parsing, matching statistics and the BurrowsWheeler Transform (BWT) are all fundamental elements of stringology. In a series of recent papers, Policriti and Prezza (DCC 2016 and Algorithmica, CPM 2017) showed how we can use an augmented runlength compressed BWT (RLBWT) of the reverse T^R of a text T, to compute offline the LZ77 parse of T in O(n log r) time and O(r) space, where n is the length of T and r is the number of runs in the BWT of T^R. In this paper we first extend a wellknown technique for updating an unaugmented RLBWT when a character is prepended to a text, to work with Policriti and Prezza's augmented RLBWT. This immediately implies that we can build online the LZ77 parse of T while still using O(n log r) time and O(r) space; it also seems likely to be of independent interest. Our experiments, using an extension of Ohno, Takabatake, I and Sakamoto's (IWOCA 2017) implementation of updating, show our approach is both time and spaceefficient for repetitive strings. We then show how to augment the RLBWT further  albeit making it static again and increasing its space by a factor proportional to the size of the alphabet  such that later, given another string S and O(log log n)time random access to T, we can compute the matching statistics of S with respect to T in O(S log log n) time.
BibTeX  Entry
@InProceedings{bannai_et_al:LIPIcs:2018:8702,
author = {Hideo Bannai and Travis Gagie and Tomohiro I},
title = {{Online LZ77 Parsing and Matching Statistics with RLBWTs}},
booktitle = {Annual Symposium on Combinatorial Pattern Matching (CPM 2018)},
pages = {7:17:12},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959770743},
ISSN = {18688969},
year = {2018},
volume = {105},
editor = {Gonzalo Navarro and David Sankoff and Binhai Zhu},
publisher = {Schloss DagstuhlLeibnizZentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2018/8702},
URN = {urn:nbn:de:0030drops87025},
doi = {10.4230/LIPIcs.CPM.2018.7},
annote = {Keywords: LempelZiv 1977, Matching Statistics, RunLength Compressed BurrowsWheeler Transform}
}
18.05.2018
Keywords: 

LempelZiv 1977, Matching Statistics, RunLength Compressed BurrowsWheeler Transform 
Seminar: 

Annual Symposium on Combinatorial Pattern Matching (CPM 2018)

Issue date: 

2018 
Date of publication: 

18.05.2018 