License: Creative Commons Attribution 3.0 Unported license (CC-BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2020.26
URN: urn:nbn:de:0030-drops-121512
URL: https://drops.dagstuhl.de/opus/volltexte/2020/12151/


Nakashima, Katsuhito ; Fujisato, Noriki ; Hendrian, Diptarama ; Nakashima, Yuto ; Yoshinaka, Ryo ; Inenaga, Shunsuke ; Bannai, Hideo ; Shinohara, Ayumi ; Takeda, Masayuki

DAWGs for Parameterized Matching: Online Construction and Related Indexing Structures



Abstract

Two strings x and y over Σ ∪ Π of equal length are said to parameterized match (p-match) if there is a renaming bijection f: Σ ∪ Π → Σ ∪ Π that is the identity on Σ and transforms x to y (or vice versa). The p-matching problem is to find the substrings of a text that p-match a given pattern. In this paper, we propose parameterized suffix automata (p-suffix automata) and parameterized directed acyclic word graphs (PDAWGs), which are the p-matching versions of suffix automata and DAWGs. While suffix automata and DAWGs are equivalent for standard strings, we show that p-suffix automata can have Θ(n²) nodes and edges but PDAWGs have only O(n) nodes and edges, where n is the length of the input string. We also give an O(n |Π| log(|Π| + |Σ|))-time, O(n)-space algorithm that builds the PDAWG in a left-to-right online manner. As a byproduct, it is shown that the parameterized suffix tree for the reversed string can also be built in the same time and space, in a right-to-left online manner.
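
The definition of p-matching can be made concrete with a small sketch. The code below is not the PDAWG construction from the paper; it is a naive quadratic-time check based on the well-known prev-encoding, in which each parameter symbol is replaced by the distance to its previous occurrence (0 if there is none), so that two strings p-match exactly when their encodings are equal. The function names `prev_encode`, `p_match`, and `p_occurrences`, and the choice to pass Σ as a Python set, are assumptions made for this illustration.

```python
def prev_encode(s, static_alphabet):
    """Prev-encoding: keep static symbols (from Sigma) unchanged and replace
    each parameter symbol by the distance to its previous occurrence,
    or 0 if it has not occurred before."""
    last_seen = {}
    encoded = []
    for i, c in enumerate(s):
        if c in static_alphabet:
            encoded.append(c)
        else:
            encoded.append(i - last_seen[c] if c in last_seen else 0)
            last_seen[c] = i
    return encoded


def p_match(x, y, static_alphabet):
    """True if x and y have equal length and equal prev-encodings, i.e. some
    renaming of parameter symbols (identity on Sigma) turns x into y."""
    return (len(x) == len(y)
            and prev_encode(x, static_alphabet) == prev_encode(y, static_alphabet))


def p_occurrences(text, pattern, static_alphabet):
    """Naive scan: start positions of substrings of text that p-match pattern.
    Each window is re-encoded from scratch, which is what an index such as a
    PDAWG is meant to avoid."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if p_match(text[i:i + m], pattern, static_alphabet)]


if __name__ == "__main__":
    sigma = {"a", "b"}  # static symbols; every other character is a parameter
    print(p_match("xayx", "yazy", sigma))          # renaming x->y, y->z works
    print(p_match("xayx", "yazz", sigma))          # no consistent renaming
    print(p_occurrences("xyayzaz", "uau", sigma))  # positions of p-matches
```

Note that each window must be encoded on its own: the prev-encoding of a substring is not simply a slice of the prev-encoding of the whole text, because a parameter's previous occurrence may lie outside the window.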

BibTeX - Entry

@InProceedings{nakashima_et_al:LIPIcs:2020:12151,
  author =	{Katsuhito Nakashima and Noriki Fujisato and Diptarama Hendrian and Yuto Nakashima and Ryo Yoshinaka and Shunsuke Inenaga and Hideo Bannai and Ayumi Shinohara and Masayuki Takeda},
  title =	{{DAWGs for Parameterized Matching: Online Construction and Related Indexing Structures}},
  booktitle =	{31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
  pages =	{26:1--26:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-149-8},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{161},
  editor =	{Inge Li G{\o}rtz and Oren Weimann},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2020/12151},
  URN =		{urn:nbn:de:0030-drops-121512},
  doi =		{10.4230/LIPIcs.CPM.2020.26},
  annote =	{Keywords: parameterized matching, suffix trees, DAWGs, suffix automata}
}

Keywords: parameterized matching, suffix trees, DAWGs, suffix automata
Collection: 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)
Issue Date: 2020
Date of publication: 09.06.2020

