On Indeterminate Strings Matching

Authors Paweł Gawrychowski, Samah Ghazawi, Gad M. Landau



PDF
Thumbnail PDF

File

LIPIcs.CPM.2020.14.pdf
  • Filesize: 0.96 MB
  • 14 pages

Document Identifiers

Author Details

Paweł Gawrychowski
  • Institute of Computer Science, University of Wrocław, Poland
Samah Ghazawi
  • Department of Computer Science, University of Haifa, Israel
Gad M. Landau
  • Department of Computer Science, University of Haifa, Israel
  • Department of Computer Science and Engineering, NYU Tandon School of Engineering, Brooklyn, NY, USA

Cite AsGet BibTex

Paweł Gawrychowski, Samah Ghazawi, and Gad M. Landau. On Indeterminate Strings Matching. In 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 14:1-14:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.CPM.2020.14

Abstract

Given two indeterminate equal-length strings p and t with a set of characters per position in both strings, we obtain a determinate string p_w from p and a determinate string t_w from t by choosing one character per position. Then, we say that p and t match when p_w and t_w match for some choice of the characters. While the most standard notion of a match for determinate strings is that they are simply identical, in certain applications it is more appropriate to use other definitions, with the prime examples being parameterized matching, order-preserving matching, and the recently introduced Cartesian tree matching. We provide a systematic study of the complexity of string matching for indeterminate equal-length strings, for different notions of matching. We use n to denote the length of both strings, and r to be an upper-bound on the number of uncertain characters per position. First, we provide the first polynomial time algorithm for the Cartesian tree version that runs in deterministic 𝒪(nlog² n) and expected 𝒪(nlog nlog log n) time using 𝒪(nlog n) space, for constant r. Second, we establish NP-hardness of the order-preserving version for r=2, thus solving a question explicitly stated by Henriques et al. [CPM 2018], who showed hardness for r=3. Third, we establish NP-hardness of the parameterized version for r=2. As both parameterized and order-preserving indeterminate matching reduce to the standard determinate matching for r=1, this provides a complete classification for these three variants.

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • string matching
  • indeterminate strings
  • Cartesian trees
  • order-preserving matching
  • parameterized matching

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. A. Alatabbi, A. S. M. Sohidull Islam, M. S. Rahman, R. J. Simpson, and W. F. Smyth. Enhanced covers of regular & indeterminate strings using prefix tables. Automata, Languages and Combinatorics, 21(3):131-147, 2016. Google Scholar
  2. A. Amir, Y. Aumann, G. M. Landau, M. Lewenstein, and N. Lewenstein. Pattern matching with swaps. Journal of Algorithms, 37(2):247-266, 2000. Google Scholar
  3. A. Amir, M. Farach, and S. Muthukrishnan. Alphabet dependence in parameterized matching. Information Processing Letters, 49(3):111-115, 1994. Google Scholar
  4. A. Apostolico, Péter L. Erdös, and M. Lewenstein. Parameterized matching with mismatches. Discrete Algorithms, 5(1):135-140, 2007. Google Scholar
  5. B. S. Baker. A theory of parameterized pattern matching: algorithms and applications. In 25th STOC, pages 71-80, 1993. Google Scholar
  6. B. S. Baker. Parameterized pattern matching: Algorithms and applications. Journal of Computer and System Sciences, 52(1):28-42, 1996. Google Scholar
  7. M. Bataa, S. G. Park, A. Amir, G. M. Landau, and K. Park. Finding periods in cartesian tree matching. In 30th IWOCA, volume 11638, pages 70-84, 2019. Google Scholar
  8. G. Bernardini, P. Gawrychowski, N. Pisanti, S. P. Pissis, and G. Rosone. Even faster elastic-degenerate string matching via fast matrix multiplication. In 46th ICALP, pages 21:1-21:15, 2019. Google Scholar
  9. P. Bille, I. Li Gørtz, H. W. Vildhøj, and D. K. Wind. String matching with variable length gaps. Theoretical Computer Science, 443:385-394, October 2010. Google Scholar
  10. P. Burcsi, F. Cicalese, G. Fici, and Z. Lipták. Algorithms for jumbled pattern matching in strings. International Journal of Foundations of Computer Science, 23(02):357–374, 2012. Google Scholar
  11. S. Cho, J. C. Na, K. Park, and J. S. Sim. A fast algorithm for order-preserving pattern matching. Information Processing Letters, 115(2):397-402, 2015. Google Scholar
  12. M. Christodoulakis, P. J. Ryan, W. F. Smyth, and S. Wang. Indeterminate strings, prefix arrays & undirected graphs. Theoretical Computer Science, 600:34-48, 2015. Google Scholar
  13. D. Costa, L. M. S. Russo, R. Henriques, H. Bannai, and A. P. Francisco. Order-preserving pattern matching indeterminate strings. In 30th CPM, 2019. URL: http://arxiv.org/abs/1905.02589.
  14. M. Crochemore, C. S. Iliopoulos, T. Kociumaka, J. Radoszewski, W. Rytter, and T. Waleń. Covering problems for partial words and for indeterminate strings. In 25th ISAAC, pages 220-232, 2014. Google Scholar
  15. J. W. Daykin, R. Groult, Y. Guesnet, T. Lecroq, A. Lefebvre, M. Léonard, L. Mouchard, É. Prieur-Gaston, and B. Watson. Efficient pattern matching in degenerate strings with the burrows-wheeler transform. Information Processing Letters, 147, 2017. Google Scholar
  16. P. Gawrychowski and P. Uznański. Order-preserving pattern matching with k mismatches. Theoretical Computer Science, 638:136-144, 2016. Google Scholar
  17. G. Gourdel, T. Kociumaka, J. Radoszewski, W. Rytter, A. Shur, and T. Waleń. String periods in the order-preserving model. In 35th STACS, volume 96, pages 1-16, 2018. Google Scholar
  18. G. Gu, S. Song, S. Faro, T. Lecroq, and K. Park. Fast multiple pattern cartesian tree matching. In 14th WALCOM, pages 107-119, 2020. Google Scholar
  19. C. Hazay, M. Lewenstein, and D. Sokol. Approximate parameterized matching. ACM Transactions on Algorithms (TALG), 3(3):29-44, 2007. Google Scholar
  20. J. Helling, P. J. Ryan, W. F. Smyth, and M. Soltys. Constructing an indeterminate string from its associated graph. Theoretical Computer Science, 710, March 2017. Google Scholar
  21. Rui Henriques, Alexandre P. Francisco, Luís M. S. Russo, and Hideo Bannai. Order-preserving pattern matching indeterminate strings. In 29th CPM, volume 105, pages 2:1-2:15, 2018. Google Scholar
  22. J. Holub and W. F. Smyth. Algorithms on indeterminate strings. In 14th AWOCA, pages 36-45, 2003. Google Scholar
  23. J. Holub, W. F. Smyth, and S. Wang. Fast pattern-matching on indeterminate strings. Discrete Algorithms, 6(1):37-50, 2008. Google Scholar
  24. C. Iliopoulos, R. Kundu, and S. Pissis. Efficient pattern matching in elastic-degenerate strings. In 11th LATA, pages 131-142, 2017. Google Scholar
  25. J. Kim, P. Eades, R. Fleischer, S. H. Hong, C. S. Iliopoulos, K. Park, S. J. Puglisi, and T. Tokuyama. Order-preserving matching. Theoretical Computer Science, 525:68-79, 2014. Google Scholar
  26. M. Kubica, T. Kulczyński, J. Radoszewski, W. Rytter, and T. Waleń. A linear time algorithm for consecutive permutation pattern matching. Information Processing Letters, 113(12):430-433, 2013. Google Scholar
  27. R. McIntyre and M. Soltys. An improved upper bound and algorithm for clique covers. J. Discrete Algorithms, 48:42-56, 2018. Google Scholar
  28. S. G. Park, A. Amir, G. M. Landau, and K. Park. Cartesian tree matching and indexing. In 30th CPM, pages 16:1-16:14, 2019. Google Scholar
  29. M. S. Rahman and C. S. Iliopoulos. Pattern matching algorithms with don't cares. In 33th SOFSEM, pages 116-126, 2007. Google Scholar
  30. S. Song, C. Ryu, S. Faro, T. Lecroq, and K. Park. Fast cartesian tree matching. In 26th SPIRE, pages 124-137, 2019. Google Scholar
  31. J. Vuillemin. A unifying look at data structures. Communications of the ACM, 23(4):229-239, 1980. Google Scholar
  32. R. A. Wagner and M. J. Fischer. The string-to-string correction problem. J. ACM, 21(1):168–173, 1974. Google Scholar
  33. D. E. Willard. Log-logarithmic worst-case range queries are possible in space Θ(N). Information Processing Letters, 17(2):81-84, 1983. Google Scholar
  34. B. Zeidman. Software v. software. IEEE Spectrum, 47:32-53, 2010. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail