Reconstructing General Matching Graphs

Amir, Amihood; Itzhaki, Michael

doi:10.4230/LIPIcs.CPM.2024.2

Abstract

The classical pattern matching paradigm seeks occurrences of one string in another, where both strings are drawn from an alphabet Σ. Motivated by many applications, algorithms have been developed for pattern matching where the matching relation is not necessarily the "=" relation. Examples include pattern matching with "don't cares", approximate matching, less-than matching, Cartesian-tree matching, order-preserving matching, parameterized matching, degenerate matching, function matching, and more. Some of these matchings allow for efficient pattern matching algorithms, while others do not. Little work has been done on categorizing the complexity of string matching queries by the type of matching. For example, when can exact matching be done fast? When can approximate matching be computed fast? When can tandem repeats or palindromes be recognized efficiently? This paper defines the matching graph of a given string under a matching relation. We show that the type of graph affects various string algorithms. The matching graph can also serve as a tool for proving lower bounds. We provide a lower bound for finding palindromes in a general degenerate graph. We also show some results on recognizing the minimum alphabet required for reconstructing a string that presents a given matching graph.
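
The abstract does not spell out the definition of the matching graph, so the following is only an illustrative sketch under an assumed reading: the vertices are the positions of the string, and two positions are joined by an edge when their symbols match under the chosen relation. The names `matching_graph`, `exact`, and `degenerate` are hypothetical and not taken from the paper. The sketch shows how the relation shapes the graph: exact matching yields disjoint cliques, while degenerate matching (symbols are sets of letters that match when they intersect) can yield a non-transitive graph.

```python
from itertools import combinations


def matching_graph(s, matches):
    """Edge set {(i, j) : i < j and matches(s[i], s[j])}.

    Assumed definition for illustration: vertices are positions of s.
    """
    return {
        (i, j)
        for i, j in combinations(range(len(s)), 2)
        if matches(s[i], s[j])
    }


def exact(a, b):
    # Ordinary "=" relation on symbols.
    return a == b


def degenerate(a, b):
    # Assumed degenerate relation: each symbol is a set of letters,
    # and two symbols match when the sets share at least one letter.
    return bool(set(a) & set(b))


# Exact matching on "abab": positions 0 and 2 hold 'a', positions 1 and 3 hold 'b'.
exact_edges = matching_graph("abab", exact)

# Degenerate matching: the middle symbol matches both neighbours,
# but the two outer symbols do not match each other.
degenerate_edges = matching_graph([{"a"}, {"a", "b"}, {"b"}], degenerate)

print(sorted(exact_edges))
print(sorted(degenerate_edges))
```

Under these assumptions the second graph is a path on three vertices, so the relation is not transitive; this is the kind of structural difference between matching relations that the abstract refers to.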

