Search Results

Documents authored by Denas, Olgert


Document
Fast matching statistics in small space

Authors: Djamal Belazzougui, Fabio Cunial, and Olgert Denas

Published in: LIPIcs, Volume 103, 17th International Symposium on Experimental Algorithms (SEA 2018)


Abstract
Computing the matching statistics of a string S with respect to a string T on an alphabet of size sigma is a fundamental primitive for a number of large-scale string analysis applications, including the comparison of entire genomes, for which space is a pressing issue. This paper takes from theory to practice an existing algorithm that uses just O(|T|log{sigma}) bits of space, and that computes a compact encoding of the matching statistics array in O(|S|log{sigma}) time. The techniques used to speed up the algorithm are of general interest, since they optimize queries on the existence of a Weiner link from a node of the suffix tree, and parent operations after unsuccessful Weiner links. Thus, they can be applied to other matching statistics algorithms, as well as to any suffix tree traversal that relies on such calls. Some of our optimizations yield a matching statistics implementation that is up to three times faster than a plain version of the algorithm, depending on the similarity between S and T. In genomic datasets of practical significance we achieve speedups of up to 1.8, but our fastest implementations take on average twice the time of an existing code based on the LCP array. The key advantage is that our implementations need between one half and one fifth of the competitor's memory, and they approach comparable running times when S and T are very similar.

Cite as

Djamal Belazzougui, Fabio Cunial, and Olgert Denas. Fast matching statistics in small space. In 17th International Symposium on Experimental Algorithms (SEA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 103, pp. 17:1-17:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{belazzougui_et_al:LIPIcs.SEA.2018.17,
  author =	{Belazzougui, Djamal and Cunial, Fabio and Denas, Olgert},
  title =	{{Fast matching statistics in small space}},
  booktitle =	{17th International Symposium on Experimental Algorithms (SEA 2018)},
  pages =	{17:1--17:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-070-5},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{103},
  editor =	{D'Angelo, Gianlorenzo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2018.17},
  URN =		{urn:nbn:de:0030-drops-89528},
  doi =		{10.4230/LIPIcs.SEA.2018.17},
  annote =	{Keywords: Matching statistics, maximal repeat, Burrows-Wheeler transform, wavelet tree, suffix tree topology}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail