Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH scholarly article en Nellore, Abhinav; Nguyen, Austin; Thompson, Reid F. https://www.dagstuhl.de/lipics License: Creative Commons Attribution 4.0 license (CC BY 4.0)
When quoting this document, please refer to the following:
DOI:
URN: urn:nbn:de:0030-drops-139717
URL:


An Invertible Transform for Efficient String Matching in Labeled Digraphs



Abstract

Let G = (V, E) be a digraph where each vertex is unlabeled, each edge is labeled by a character in some alphabet Ω, and any two edges with both the same head and the same tail have different labels. The powerset construction gives a transform of G into a weakly connected digraph G' = (V', E') that enables solving the decision problem of whether there is a walk in G matching an arbitrarily long query string q in time linear in |q| and independent of |E| and |V|. We show G is uniquely determined by G' when for every v_𝓁 ∈ V, there is some distinct string s_𝓁 on Ω such that v_𝓁 is the origin of a closed walk in G matching s_𝓁, and no other walk in G matches s_𝓁 unless it starts and ends at v_𝓁. We then exploit this invertibility condition to strategically alter any G so its transform G' enables retrieval of all t terminal vertices of walks in the unaltered G matching q in O(|q| + t log |V|) time. We conclude by proposing two defining properties of a class of transforms that includes the Burrows-Wheeler transform and the transform presented here.
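
The decision problem described in the abstract can be illustrated with the textbook powerset (subset) construction. The following is a minimal sketch under stated assumptions, not the paper's exact definition of G': edges are given as hypothetical (tail, label, head) triples, every vertex is treated as a possible starting point of a walk, and the names `powerset_transform`, `run`, and `has_matching_walk` are invented for illustration. Each state of the transform is the set of vertices at which some walk matching the string read so far can end; states are numbered so that each query step is a single dictionary lookup.

```python
from collections import defaultdict


def powerset_transform(vertices, edges):
    """Subset construction over a labeled digraph.

    edges is an iterable of (tail, label, head) triples (an assumed format).
    Returns (delta, members): delta maps (state_id, label) -> state_id and
    members[state_id] is the frozenset of vertices of G in that state.
    """
    out = defaultdict(set)  # (tail, label) -> heads reachable by one edge
    labels = set()
    for tail, label, head in edges:
        out[(tail, label)].add(head)
        labels.add(label)

    start = frozenset(vertices)  # a walk may begin at any vertex
    state_id = {start: 0}
    members = [start]
    delta = {}
    stack = [start]
    while stack:
        state = stack.pop()
        for label in labels:
            nxt = frozenset(h for v in state for h in out.get((v, label), ()))
            if not nxt:
                continue  # no walk can be extended by this label
            if nxt not in state_id:
                state_id[nxt] = len(members)
                members.append(nxt)
                stack.append(nxt)
            delta[(state_id[state], label)] = state_id[nxt]
    return delta, members


def run(delta, query):
    """Follow query from state 0; return the final state id, or None."""
    state = 0
    for ch in query:
        state = delta.get((state, ch))
        if state is None:
            return None
    return state


def has_matching_walk(delta, query):
    """True if some walk in the original digraph matches query."""
    return run(delta, query) is not None


# Small illustrative digraph (hypothetical data).
example_edges = [(0, "a", 1), (1, "b", 2), (2, "a", 0), (1, "a", 1)]
delta, members = powerset_transform({0, 1, 2}, example_edges)
print(has_matching_walk(delta, "ab"))   # True
print(has_matching_walk(delta, "bb"))   # False
print(sorted(members[run(delta, "ab")]))  # [2]
```

Once the transform is built, a query costs one lookup per character of q, which is what makes the running time independent of |V| and |E|. The price is that the number of states can grow exponentially in |V| in the worst case, and reading off the end vertices from `members` as done here does not reflect the O(|q| + t log |V|) retrieval scheme the abstract refers to.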

BibTeX - Entry

@InProceedings{nellore_et_al:LIPIcs.CPM.2021.20,
  author =	{Nellore, Abhinav and Nguyen, Austin and Thompson, Reid F.},
  title =	{{An Invertible Transform for Efficient String Matching in Labeled Digraphs}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{20:1--20:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2021/13971},
  URN =		{urn:nbn:de:0030-drops-139717},
  doi =		{10.4230/LIPIcs.CPM.2021.20},
  annote =	{Keywords: pattern matching, string matching, Burrows-Wheeler transform, labeled graphs}
}

Keywords: pattern matching, string matching, Burrows-Wheeler transform, labeled graphs
Seminar: 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)
Issue date: 2021
Date of publication: 30.06.2021

