# On Indeterminate Strings Matching

## File

LIPIcs.CPM.2020.14.pdf
• Filesize: 0.96 MB
• 14 pages

## Cite As

Paweł Gawrychowski, Samah Ghazawi, and Gad M. Landau. On Indeterminate Strings Matching. In 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 14:1-14:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.CPM.2020.14

## Abstract

Given two indeterminate equal-length strings p and t with a set of characters per position in both strings, we obtain a determinate string p_w from p and a determinate string t_w from t by choosing one character per position. Then, we say that p and t match when p_w and t_w match for some choice of the characters. While the most standard notion of a match for determinate strings is that they are simply identical, in certain applications it is more appropriate to use other definitions, with the prime examples being parameterized matching, order-preserving matching, and the recently introduced Cartesian tree matching. We provide a systematic study of the complexity of string matching for indeterminate equal-length strings, for different notions of matching. We use n to denote the length of both strings, and r to denote an upper bound on the number of uncertain characters per position. First, we provide the first polynomial-time algorithm for the Cartesian tree version that runs in deterministic 𝒪(n log² n) and expected 𝒪(n log n log log n) time using 𝒪(n log n) space, for constant r. Second, we establish NP-hardness of the order-preserving version for r=2, thus solving a question explicitly stated by Henriques et al. [CPM 2018], who showed hardness for r=3. Third, we establish NP-hardness of the parameterized version for r=2. As both parameterized and order-preserving indeterminate matching reduce to the standard determinate matching for r=1, this provides a complete classification for these three variants.
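
To make the definitions in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm from the paper: it enumerates every choice of p_w and t_w, so it takes exponential time and is meant only to illustrate what "match for some choice of the characters" means under each notion. All function names are hypothetical. The sketch assumes that parameterized matching treats every symbol as a parameter (a bijection between the symbols of the two strings), and that Cartesian trees place the minimum at the root with ties broken toward the leftmost position, which allows two trees to be compared through their parent-distance arrays.

```python
from itertools import product

def exact(p, t):
    """Standard match: the two determinate strings are identical."""
    return list(p) == list(t)

def parameterized(p, t):
    """Match if some bijection on symbols maps p onto t
    (assumption: every symbol is a parameter)."""
    fwd, bwd = {}, {}
    for a, b in zip(p, t):
        if fwd.setdefault(a, b) != b or bwd.setdefault(b, a) != a:
            return False
    return True

def order_preserving(p, t):
    """Match if the relative order of every pair of positions agrees."""
    n = len(p)
    return all((p[i] < p[j]) == (t[i] < t[j])
               for i in range(n) for j in range(n))

def parent_distance(s):
    """For each i, the distance to the nearest j < i with s[j] <= s[i],
    or 0 if no such j exists. Computed with a monotone stack."""
    stack, pd = [], []
    for i, x in enumerate(s):
        while stack and s[stack[-1]] > x:
            stack.pop()
        pd.append(i - stack[-1] if stack else 0)
        stack.append(i)
    return pd

def cartesian_tree(p, t):
    """Match if both strings have the same Cartesian tree shape,
    tested via equal parent-distance arrays."""
    return parent_distance(p) == parent_distance(t)

def indeterminate_match(p, t, matches):
    """p and t are equal-length lists of character sets. Brute force:
    try every pair of determinate strings (exponential time)."""
    if len(p) != len(t):
        return False
    return any(matches(pw, tw)
               for pw in product(*p)
               for tw in product(*t))

# Usage: one uncertain position per string (r = 2).
p = [{1}, {2, 5}, {3}]
t = [{4}, {1, 6}, {5}]
print(indeterminate_match(p, t, exact))
print(indeterminate_match(p, t, order_preserving))
```

In this example the choice p_w = (1, 5, 3) and t_w = (4, 6, 5) has the same relative order, so the strings match in the order-preserving sense even though no choice makes them identical. Passing the match predicate as a parameter keeps the enumeration separate from the notion of matching, mirroring how the abstract treats the variants side by side.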

## Subject Classification

##### ACM Subject Classification
• Theory of computation → Pattern matching
##### Keywords
• string matching
• indeterminate strings
• Cartesian trees
• order-preserving matching
• parameterized matching

## References

1. A. Alatabbi, A. S. M. Sohidull Islam, M. S. Rahman, R. J. Simpson, and W. F. Smyth. Enhanced covers of regular & indeterminate strings using prefix tables. Automata, Languages and Combinatorics, 21(3):131-147, 2016.
2. A. Amir, Y. Aumann, G. M. Landau, M. Lewenstein, and N. Lewenstein. Pattern matching with swaps. Journal of Algorithms, 37(2):247-266, 2000.
3. A. Amir, M. Farach, and S. Muthukrishnan. Alphabet dependence in parameterized matching. Information Processing Letters, 49(3):111-115, 1994.
4. A. Apostolico, Péter L. Erdös, and M. Lewenstein. Parameterized matching with mismatches. Discrete Algorithms, 5(1):135-140, 2007.
5. B. S. Baker. A theory of parameterized pattern matching: algorithms and applications. In 25th STOC, pages 71-80, 1993.
6. B. S. Baker. Parameterized pattern matching: Algorithms and applications. Journal of Computer and System Sciences, 52(1):28-42, 1996.
7. M. Bataa, S. G. Park, A. Amir, G. M. Landau, and K. Park. Finding periods in Cartesian tree matching. In 30th IWOCA, volume 11638, pages 70-84, 2019.
8. G. Bernardini, P. Gawrychowski, N. Pisanti, S. P. Pissis, and G. Rosone. Even faster elastic-degenerate string matching via fast matrix multiplication. In 46th ICALP, pages 21:1-21:15, 2019.
9. P. Bille, I. Li Gørtz, H. W. Vildhøj, and D. K. Wind. String matching with variable length gaps. Theoretical Computer Science, 443:385-394, October 2010.
10. P. Burcsi, F. Cicalese, G. Fici, and Z. Lipták. Algorithms for jumbled pattern matching in strings. International Journal of Foundations of Computer Science, 23(2):357-374, 2012.
11. S. Cho, J. C. Na, K. Park, and J. S. Sim. A fast algorithm for order-preserving pattern matching. Information Processing Letters, 115(2):397-402, 2015.
12. M. Christodoulakis, P. J. Ryan, W. F. Smyth, and S. Wang. Indeterminate strings, prefix arrays & undirected graphs. Theoretical Computer Science, 600:34-48, 2015.
13. D. Costa, L. M. S. Russo, R. Henriques, H. Bannai, and A. P. Francisco. Order-preserving pattern matching indeterminate strings. In 30th CPM, 2019. URL: http://arxiv.org/abs/1905.02589.
14. M. Crochemore, C. S. Iliopoulos, T. Kociumaka, J. Radoszewski, W. Rytter, and T. Waleń. Covering problems for partial words and for indeterminate strings. In 25th ISAAC, pages 220-232, 2014.
15. J. W. Daykin, R. Groult, Y. Guesnet, T. Lecroq, A. Lefebvre, M. Léonard, L. Mouchard, É. Prieur-Gaston, and B. Watson. Efficient pattern matching in degenerate strings with the Burrows-Wheeler transform. Information Processing Letters, 147, 2017.
16. P. Gawrychowski and P. Uznański. Order-preserving pattern matching with k mismatches. Theoretical Computer Science, 638:136-144, 2016.
17. G. Gourdel, T. Kociumaka, J. Radoszewski, W. Rytter, A. Shur, and T. Waleń. String periods in the order-preserving model. In 35th STACS, volume 96, pages 1-16, 2018.
18. G. Gu, S. Song, S. Faro, T. Lecroq, and K. Park. Fast multiple pattern Cartesian tree matching. In 14th WALCOM, pages 107-119, 2020.
19. C. Hazay, M. Lewenstein, and D. Sokol. Approximate parameterized matching. ACM Transactions on Algorithms (TALG), 3(3):29-44, 2007.
20. J. Helling, P. J. Ryan, W. F. Smyth, and M. Soltys. Constructing an indeterminate string from its associated graph. Theoretical Computer Science, 710, March 2017.
21. R. Henriques, A. P. Francisco, L. M. S. Russo, and H. Bannai. Order-preserving pattern matching indeterminate strings. In 29th CPM, volume 105, pages 2:1-2:15, 2018.
22. J. Holub and W. F. Smyth. Algorithms on indeterminate strings. In 14th AWOCA, pages 36-45, 2003.
23. J. Holub, W. F. Smyth, and S. Wang. Fast pattern-matching on indeterminate strings. Discrete Algorithms, 6(1):37-50, 2008.
24. C. Iliopoulos, R. Kundu, and S. Pissis. Efficient pattern matching in elastic-degenerate strings. In 11th LATA, pages 131-142, 2017.
25. J. Kim, P. Eades, R. Fleischer, S. H. Hong, C. S. Iliopoulos, K. Park, S. J. Puglisi, and T. Tokuyama. Order-preserving matching. Theoretical Computer Science, 525:68-79, 2014.
26. M. Kubica, T. Kulczyński, J. Radoszewski, W. Rytter, and T. Waleń. A linear time algorithm for consecutive permutation pattern matching. Information Processing Letters, 113(12):430-433, 2013.
27. R. McIntyre and M. Soltys. An improved upper bound and algorithm for clique covers. J. Discrete Algorithms, 48:42-56, 2018.
28. S. G. Park, A. Amir, G. M. Landau, and K. Park. Cartesian tree matching and indexing. In 30th CPM, pages 16:1-16:14, 2019.
29. M. S. Rahman and C. S. Iliopoulos. Pattern matching algorithms with don't cares. In 33rd SOFSEM, pages 116-126, 2007.
30. S. Song, C. Ryu, S. Faro, T. Lecroq, and K. Park. Fast Cartesian tree matching. In 26th SPIRE, pages 124-137, 2019.
31. J. Vuillemin. A unifying look at data structures. Communications of the ACM, 23(4):229-239, 1980.
32. R. A. Wagner and M. J. Fischer. The string-to-string correction problem. J. ACM, 21(1):168-173, 1974.
33. D. E. Willard. Log-logarithmic worst-case range queries are possible in space Θ(N). Information Processing Letters, 17(2):81-84, 1983.
34. B. Zeidman. Software v. software. IEEE Spectrum, 47:32-53, 2010.