eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2016-06-27
2:1
2:12
10.4230/LIPIcs.CPM.2016.2
article
Space-Efficient Dictionaries for Parameterized and Order-Preserving Pattern Matching
Ganguly, Arnab
Hon, Wing-Kai
Sadakane, Kunihiko
Shah, Rahul
Thankachan, Sharma V.
Yang, Yilin
Let S and S' be two strings of the same length. We consider the following two variants of string matching.
* Parameterized Matching: The characters of S and S' are partitioned into static characters and parameterized characters.
The strings are a parameterized match iff the static characters match exactly and there exists a one-to-one function which renames the parameterized characters in S to those in S'.
* Order-Preserving Matching: The strings are an order-preserving match iff for any two integers i,j in [1,|S|], S[i] <= S[j] iff S'[i] <= S'[j].
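The two definitions above can be checked directly by brute force, as in the following minimal Python sketch. The function names `parameterized_match` and `order_preserving_match` and the `static` argument (the set of static characters) are hypothetical and introduced only for illustration; this is a direct transcription of the definitions, not the indexing technique of the paper.

```python
def parameterized_match(s, t, static):
    """Return True if s and t are a parameterized match.

    `static` is the set of static characters (an assumption of this sketch);
    every other character is treated as a parameterized character.
    """
    if len(s) != len(t):
        return False
    fwd, bwd = {}, {}  # renaming s -> t and its inverse, to enforce one-to-one
    for a, b in zip(s, t):
        if a in static or b in static:
            # Static characters must match exactly.
            if a != b:
                return False
            continue
        # Parameterized characters must follow one consistent bijective renaming.
        if fwd.setdefault(a, b) != b or bwd.setdefault(b, a) != a:
            return False
    return True


def order_preserving_match(s, t):
    """Return True if s[i] <= s[j] iff t[i] <= t[j] for all i, j (quadratic check)."""
    if len(s) != len(t):
        return False
    n = len(s)
    return all((s[i] <= s[j]) == (t[i] <= t[j])
               for i in range(n) for j in range(n))
```

For instance, with static characters {a, b}, the strings "axbxa" and "aybya" are a parameterized match under the renaming x -> y, while [1, 3, 2] and [10, 30, 20] are an order-preserving match because their elements have the same relative order.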
Let P be a collection of d patterns {P_1, P_2, ..., P_d} of total length n characters, which are chosen from an alphabet Sigma.
Given a text T, also over Sigma, we consider the dictionary indexing problem under the above definitions of string matching.
Specifically, the task is to index P such that we can report all positions j where at least one of the patterns P_i in P is a parameterized match (resp. order-preserving match) with the same-length substring of T starting at j. Previous best-known indexes occupy O(n * log(n)) bits and can report all occ positions in O(|T| * log(|Sigma|) + occ) time. We present space-efficient indexes that occupy O(n * log(|Sigma|+d) * log(n)) bits and report all occ positions in O(|T| * (log(|Sigma|) + log_{|Sigma|}(n)) + occ) time for parameterized matching and in O(|T| * log(n) + occ) time for order-preserving matching.
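To make the reporting task concrete, here is a naive baseline sketched in Python for the order-preserving case. It is not the index described above and achieves none of the stated bounds; it only shows what is reported. The names `dense_ranks` and `report_positions` and the `signature` parameter are hypothetical. The sketch relies on the fact that two equal-length strings are an order-preserving match exactly when their sequences of dense ranks coincide, and it uses 0-based positions.

```python
def dense_ranks(s):
    """Replace each element by its rank among the distinct values of s."""
    order = {c: r for r, c in enumerate(sorted(set(s)))}
    return tuple(order[c] for c in s)


def report_positions(text, patterns, signature=dense_ranks):
    """Report 0-based positions j where some pattern matches text[j:j+len(pattern)].

    Brute force: every window is re-encoded from scratch, so the running time
    is far from the bounds quoted in the abstract.
    """
    # Group pattern signatures by pattern length.
    by_len = {}
    for p in patterns:
        by_len.setdefault(len(p), set()).add(signature(p))

    hits = []
    for j in range(len(text)):
        for m, sigs in by_len.items():
            if j + m <= len(text) and signature(text[j:j + m]) in sigs:
                hits.append(j)  # report j once, even if several patterns match
                break
    return hits
```

With text [3, 5, 4, 9, 1] and patterns [1, 3, 2] and [2, 1], the pattern [1, 3, 2] matches the window [3, 5, 4] at position 0, and [2, 1] matches the windows [5, 4] and [9, 1] at positions 1 and 3.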
https://drops.dagstuhl.de/storage/00lipics/lipics-vol054-cpm2016/LIPIcs.CPM.2016.2/LIPIcs.CPM.2016.2.pdf
Parameterized Matching
Order-preserving Matching
Dictionary Indexing
Aho-Corasick Automaton
Sparsification