Cartesian Tree Matching and Indexing

Authors Sung Gwan Park, Amihood Amir, Gad M. Landau, Kunsoo Park



PDF
Thumbnail PDF

File

LIPIcs.CPM.2019.16.pdf
  • Filesize: 0.81 MB
  • 14 pages

Document Identifiers

Author Details

Sung Gwan Park
  • Seoul National University, Korea
Amihood Amir
  • Bar-Ilan University, Israel
Gad M. Landau
  • University of Haifa, Israel
  • New York University, USA
Kunsoo Park
  • Seoul National University, Korea

Acknowledgements

S.G. Park and K. Park were supported by Institute for Information & communications Technology Promotion(IITP) grant funded by the Korea government(MSIT) (No. 2018-0-00551, Framework of Practical Algorithms for NP-hard Graph Problems). A. Amir and G.M. Landau were partially supported by the Israel Science Foundation grant 571/14, and Grant No. 2014028 from the United States-Israel Binational Science Foundation (BSF).

Cite AsGet BibTex

Sung Gwan Park, Amihood Amir, Gad M. Landau, and Kunsoo Park. Cartesian Tree Matching and Indexing. In 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 128, pp. 16:1-16:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.CPM.2019.16

Abstract

We introduce a new metric of match, called Cartesian tree matching, which means that two strings match if they have the same Cartesian trees. Based on Cartesian tree matching, we define single pattern matching for a text of length n and a pattern of length m, and multiple pattern matching for a text of length n and k patterns of total length m. We present an O(n+m) time algorithm for single pattern matching, and an O((n+m) log k) deterministic time or O(n+m) randomized time algorithm for multiple pattern matching. We also define an index data structure called Cartesian suffix tree, and present an O(n) randomized time algorithm to build the Cartesian suffix tree. Our efficient algorithms for Cartesian tree matching use a representation of the Cartesian tree, called the parent-distance representation.

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • Cartesian tree matching
  • Pattern matching
  • Indexing
  • Parent-distance representation

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Alfred V. Aho and Margaret J. Corasick. Efficient String Matching: An Aid to Bibliographic Search. Commun. ACM, 18(6):333-340, 1975. URL: http://dx.doi.org/10.1145/360825.360855.
  2. Amihood Amir, Yonatan Aumann, Gad M. Landau, Moshe Lewenstein, and Noa Lewenstein. Pattern Matching with Swaps. J. Algorithms, 37(2):247-266, 2000. URL: http://dx.doi.org/10.1006/jagm.2000.1120.
  3. Amihood Amir, Richard Cole, Ramesh Hariharan, Moshe Lewenstein, and Ely Porat. Overlap matching. Inf. Comput., 181(1):57-74, 2003. URL: http://dx.doi.org/10.1016/S0890-5401(02)00035-4.
  4. Amihood Amir, Martin Farach, and S. Muthukrishnan. Alphabet Dependence in Parameterized Matching. Inf. Process. Lett., 49(3):111-115, 1994. URL: http://dx.doi.org/10.1016/0020-0190(94)90086-8.
  5. Amihood Amir, Moshe Lewenstein, and Ely Porat. Approximate swapped matching. Inf. Process. Lett., 83(1):33-39, 2002. URL: http://dx.doi.org/10.1016/S0020-0190(01)00302-7.
  6. Alberto Apostolico, Péter L. Erdös, and Moshe Lewenstein. Parameterized matching with mismatches. J. Discrete Algorithms, 5(1):135-140, 2007. URL: http://dx.doi.org/10.1016/j.jda.2006.03.014.
  7. Brenda S. Baker. A theory of parameterized pattern matching: algorithms and applications. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, May 16-18, 1993, San Diego, CA, USA, pages 71-80, 1993. URL: http://dx.doi.org/10.1145/167088.167115.
  8. Brenda S. Baker. Parameterized Duplication in Strings: Algorithms and an Application to Software Maintenance. SIAM J. Comput., 26(5):1343-1362, 1997. URL: http://dx.doi.org/10.1137/S0097539793246707.
  9. Peter Burcsi, Ferdinando Cicalese, Gabriele Fici, and Zsuzsanna Lipták. Algorithms for Jumbled Pattern Matching in Strings. Int. J. Found. Comput. Sci., 23(2):357-374, 2012. URL: http://dx.doi.org/10.1142/S0129054112400175.
  10. Peter Burcsi, Ferdinando Cicalese, Gabriele Fici, and Zsuzsanna Lipták. On Approximate Jumbled Pattern Matching in Strings. Theory Comput. Syst., 50(1):35-51, 2012. URL: http://dx.doi.org/10.1007/s00224-011-9344-5.
  11. Tamanna Chhabra, Emanuele Giaquinta, and Jorma Tarhio. Filtration Algorithms for Approximate Order-Preserving Matching. In String Processing and Information Retrieval - 22nd International Symposium, SPIRE 2015, London, UK, September 1-4, 2015, Proceedings, pages 177-187, 2015. URL: http://dx.doi.org/10.1007/978-3-319-23826-5_18.
  12. Richard Cole and Ramesh Hariharan. Faster suffix tree construction with missing suffix links. In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, May 21-23, 2000, Portland, OR, USA, pages 407-415, 2000. URL: http://dx.doi.org/10.1145/335305.335352.
  13. Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Alessio Langiu, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. Order-preserving indexing. Theor. Comput. Sci., 638:122-135, 2016. URL: http://dx.doi.org/10.1016/j.tcs.2015.06.050.
  14. Erik D. Demaine, Gad M. Landau, and Oren Weimann. On Cartesian Trees and Range Minimum Queries. Algorithmica, 68(3):610-625, 2014. URL: http://dx.doi.org/10.1007/s00453-012-9683-x.
  15. Tak-Chung Fu, Korris Fu-Lai Chung, Robert Wing Pong Luk, and Chak-man Ng. Stock time series pattern matching: Template-based vs. rule-based approaches. Eng. Appl. of AI, 20(3):347-364, 2007. URL: http://dx.doi.org/10.1016/j.engappai.2006.07.003.
  16. Raffaele Giancarlo. A Generalization of the Suffix Tree to Square Matrices, with Applications. SIAM J. Comput., 24(3):520-562, 1995. URL: http://dx.doi.org/10.1137/S0097539792231982.
  17. Garance Gourdel, Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, Arseny M. Shur, and Tomasz Walen. String Periods in the Order-Preserving Model. In 35th Symposium on Theoretical Aspects of Computer Science, STACS 2018, February 28 to March 3, 2018, Caen, France, pages 38:1-38:16, 2018. URL: http://dx.doi.org/10.4230/LIPIcs.STACS.2018.38.
  18. Carmit Hazay, Moshe Lewenstein, and Dina Sokol. Approximate Parameterized Matching. In Algorithms - ESA 2004, 12th Annual European Symposium, Bergen, Norway, September 14-17, 2004, Proceedings, pages 414-425, 2004. URL: http://dx.doi.org/10.1007/978-3-540-30140-0_38.
  19. Jinil Kim, Amihood Amir, Joong Chae Na, Kunsoo Park, and Jeong Seop Sim. On Representations of Ternary Order Relations in Numeric Strings. Mathematics in Computer Science, 11(2):127-136, 2017. URL: http://dx.doi.org/10.1007/s11786-016-0282-0.
  20. Jinil Kim, Peter Eades, Rudolf Fleischer, Seok-Hee Hong, Costas S. Iliopoulos, Kunsoo Park, Simon J. Puglisi, and Takeshi Tokuyama. Order-preserving matching. Theor. Comput. Sci., 525:68-79, 2014. URL: http://dx.doi.org/10.1016/j.tcs.2013.10.006.
  21. Donald E. Knuth, James H. Morris Jr., and Vaughan R. Pratt. Fast Pattern Matching in Strings. SIAM J. Comput., 6(2):323-350, 1977. URL: http://dx.doi.org/10.1137/0206024.
  22. Marcin Kubica, Tomasz Kulczynski, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. A linear time algorithm for consecutive permutation pattern matching. Inf. Process. Lett., 113(12):430-433, 2013. URL: http://dx.doi.org/10.1016/j.ipl.2013.03.015.
  23. Taehyung Lee, Joong Chae Na, and Kunsoo Park. On-line construction of parameterized suffix trees for large alphabets. Inf. Process. Lett., 111(5):201-207, 2011. URL: http://dx.doi.org/10.1016/j.ipl.2010.11.017.
  24. Edward M. McCreight. A Space-Economical Suffix Tree Construction Algorithm. J. ACM, 23(2):262-272, 1976. URL: http://dx.doi.org/10.1145/321941.321946.
  25. Joong Chae Na, Raffaele Giancarlo, and Kunsoo Park. On-Line Construction of Two-Dimensional Suffix Trees in O(n² log n) Time. Algorithmica, 48(2):173-186, 2007. URL: http://dx.doi.org/10.1007/s00453-007-0063-x.
  26. Esko Ukkonen. On-Line Construction of Suffix Trees. Algorithmica, 14(3):249-260, 1995. URL: http://dx.doi.org/10.1007/BF01206331.
  27. Jean Vuillemin. A Unifying Look at Data Structures. Commun. ACM, 23(4):229-239, 1980. URL: http://dx.doi.org/10.1145/358841.358852.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail