Order-Preserving Pattern Matching Indeterminate Strings

Authors Rui Henriques, Alexandre P. Francisco, Luís M. S. Russo, Hideo Bannai



PDF
Thumbnail PDF

File

LIPIcs.CPM.2018.2.pdf
  • Filesize: 0.56 MB
  • 15 pages

Document Identifiers

Author Details

Rui Henriques
  • INESC-ID and Instituto Superior Técnico, Universidade de Lisboa, Portugal
Alexandre P. Francisco
  • INESC-ID and Instituto Superior Técnico, Universidade de Lisboa, Portugal
Luís M. S. Russo
  • INESC-ID and Instituto Superior Técnico, Universidade de Lisboa, Portugal
Hideo Bannai
  • Department of Computer Science, Kyushu University, Japan

Cite AsGet BibTex

Rui Henriques, Alexandre P. Francisco, Luís M. S. Russo, and Hideo Bannai. Order-Preserving Pattern Matching Indeterminate Strings. In 29th Annual Symposium on Combinatorial Pattern Matching (CPM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 105, pp. 2:1-2:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.CPM.2018.2

Abstract

Given an indeterminate string pattern p and an indeterminate string text t, the problem of order-preserving pattern matching with character uncertainties (muOPPM) is to find all substrings of t that satisfy one of the possible orderings defined by p. When the text and pattern are determinate strings, we are in the presence of the well-studied exact order-preserving pattern matching (OPPM) problem with diverse applications on time series analysis. Despite its relevance, the exact OPPM problem suffers from two major drawbacks: 1) the inability to deal with indetermination in the text, thus preventing the analysis of noisy time series; and 2) the inability to deal with indetermination in the pattern, thus imposing the strict satisfaction of the orders among all pattern positions. In this paper, we provide the first polynomial algorithms to answer the muOPPM problem when: 1) indetermination is observed on the pattern or text; and 2) indetermination is observed on both the pattern and the text and given by uncertainties between pairs of characters. First, given two strings with the same length m and O(r) uncertain characters per string position, we show that the muOPPM problem can be solved in O(mr lg r) time when one string is indeterminate and r in N^+ and in O(m^2) time when both strings are indeterminate and r=2. Second, given an indeterminate text string of length n, we show that muOPPM can be efficiently solved in polynomial time and linear space.

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • Order-preserving pattern matching
  • Indeterminate string analysis
  • Generic pattern matching
  • Satisfiability

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Amihood Amir, Yonatan Aumann, Piotr Indyk, Avivit Levy, and Ely Porat. Efficient computations of l_1 and l_infinity rearrangement distances. Theor. Comput. Sci., 410(43):4382-4390, 2009. URL: http://dx.doi.org/10.1016/j.tcs.2009.07.019.
  2. Amihood Amir, Yonatan Aumann, Gad M. Landau, Moshe Lewenstein, and Noa Lewenstein. Pattern matching with swaps. J. Algorithms, 37(2):247-266, 2000. URL: http://dx.doi.org/10.1006/jagm.2000.1120.
  3. Amihood Amir, Yonatan Aumann, Moshe Lewenstein, and Ely Porat. Function matching. SIAM Journal on Computing, 35(5):1007-1022, 2006. Google Scholar
  4. Amihood Amir, Richard Cole, Ramesh Hariharan, Moshe Lewenstein, and Ely Porat. Overlap matching. Inf. Comput., 181(1):57-74, 2003. URL: http://dx.doi.org/10.1016/S0890-5401(02)00035-4.
  5. Amihood Amir and Martin Farach. Efficient 2-dimensional approximate matching of half-rectangular figures. Inf. Comput., 118(1):1-11, 1995. URL: http://dx.doi.org/10.1006/inco.1995.1047.
  6. Amihood Amir, Martin Farach, and S. Muthukrishnan. Alphabet dependence in parameterized matching. Inf. Process. Lett., 49(3):111-115, 1994. URL: http://dx.doi.org/10.1016/0020-0190(94)90086-8.
  7. Amihood Amir, Ohad Lipsky, Ely Porat, and Julia Umanski. Approximate matching in the l1 metric. In CPM, volume 5, pages 91-103. Springer, 2005. Google Scholar
  8. Amihood Amir and Igor Nor. Generalized function matching. J. Discrete Algorithms, 5(3):514-523, 2007. URL: http://dx.doi.org/10.1016/j.jda.2006.10.001.
  9. Alberto Apostolico. General pattern matching. In Mikhail J. Atallah and Marina Blanton, editors, Algorithms and Theory of Computation Handbook, pages 15-15. Chapman &Hall/CRC, 2010. Google Scholar
  10. Bengt Aspvall, Michael F Plass, and Robert Endre Tarjan. A linear-time algorithm for testing the truth of certain quantified boolean formulas. Information Processing Letters, 8(3):121-123, 1979. Google Scholar
  11. Brenda S Baker. A theory of parameterized pattern matching: algorithms and applications. In ACM symposium on Theory of computing, pages 71-80. ACM, 1993. Google Scholar
  12. Djamal Belazzougui, A. Pierrot, M. Raffinot, and Stéphane Vialette. Single and multiple consecutive permutation motif search. In Int. Symposium on Algorithms and Computation, pages 66-77. Springer, 2013. Google Scholar
  13. Robert S Boyer and J Strother Moore. A fast string searching algorithm. Communications of the ACM, 20(10):762-772, 1977. Google Scholar
  14. Emilios Cambouropoulos, M. Crochemore, C. Iliopoulos, L. Mouchard, and Yoan Pinzon. Algorithms for computing approximate repetitions in musical sequences. Int. Journal of Computer Mathematics, 79(11):1135-1148, 2002. Google Scholar
  15. Domenico Cantone, Salvatore Cristofaro, and Simone Faro. An efficient algorithm for δ-approximate matching with α-bounded gaps in musical sequences. In IW on Experimental and Efficient Algorithms, pages 428-439. Springer, 2005. Google Scholar
  16. Domenico Cantone, Salvatore Cristofaro, and Simone Faro. On tuning the (δ, α)-sequential-sampling algorithm for δ-approximate matching with alpha-bounded gaps in musical sequences. In ISMIR, pages 454-459, 2005. Google Scholar
  17. Domenico Cantone, Simone Faro, and M Oguzhan Külekci. An efficient skip-search approach to the order-preserving pattern matching problem. In Stringology, pages 22-35, 2015. Google Scholar
  18. Tamanna Chhabra, Simone Faro, M. Oguzhan Külekci, and Jorma Tarhio. Engineering order-preserving pattern matching with SIMD parallelism. Softw., Pract. Exper., 47(5):731-739, 2017. URL: http://dx.doi.org/10.1002/spe.2433.
  19. Tamanna Chhabra, M Oguzhan Külekci, and Jorma Tarhio. Alternative algorithms for order-preserving matching. In Stringology, pages 36-46, 2015. Google Scholar
  20. Tamanna Chhabra and Jorma Tarhio. A filtration method for order-preserving matching. Inf. Process. Lett., 116(2):71-74, 2016. URL: http://dx.doi.org/10.1016/j.ipl.2015.10.005.
  21. Sukhyeun Cho, Joong Chae Na, Kunsoo Park, and Jeong Seop Sim. Fast order-preserving pattern matching. In Combinatorial Optimization and Applications, pages 295-305. Springer, 2013. Google Scholar
  22. Sukhyeun Cho, Joong Chae Na, Kunsoo Park, and Jeong Seop Sim. A fast algorithm for order-preserving pattern matching. Information Processing Letters, 115(2):397-402, 2015. Google Scholar
  23. Peter Clifford, Raphaël Clifford, and Costas Iliopoulos. Faster algorithms for δ, γ-matching and related problems. In Annual Symposium on Combinatorial Pattern Matching, pages 68-78. Springer, 2005. Google Scholar
  24. Raphaël Clifford and C Iliopoulos. Approximate string matching for music analysis. Soft Computing-A Fusion of Foundations, Methodologies and Applications, 8(9):597-603, 2004. Google Scholar
  25. Richard Cole, C. Iliopoulos, T. Lecroq, W. Plandowski, and Wojciech Rytter. On special families of morphisms related to δ-matching and don't care symbols. Information Processing Letters, 85(5):227-233, 2003. Google Scholar
  26. Maxime Crochemore, Costas S Iliopoulos, Thierry Lecroq, Wojciech Plandowski, and Wojciech Rytter. Three heuristics for delta-matching: delta-bm algorithms. In CPM, pages 178-189. Springer, 2002. Google Scholar
  27. Erik D Demaine, Alejandro López-Ortiz, and J Ian Munro. Adaptive set intersections, unions, and differences. In In Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA. Citeseer, 2000. Google Scholar
  28. Branislav Ďurian, Jan Holub, Hannu Peltola, and Jorma Tarhio. Improving practical exact string matching. Information Processing Letters, 110(4):148-152, 2010. Google Scholar
  29. Michael L. Fredman. On computing the length of longest increasing subsequences. Discrete Mathematics, 11(1):29-35, 1975. URL: http://dx.doi.org/10.1016/0012-365X(75)90103-X.
  30. Kimmo Fredriksson and Szymon Grabowski. Practical and optimal string matching. In SPIRE, volume 3772, pages 376-387. Springer, 2005. Google Scholar
  31. Kimmo Fredriksson and Szymon Grabowski. Efficient algorithms for pattern matching with general gaps, character classes, and transposition invariance. Information Retrieval, 11(4):335-357, 2008. Google Scholar
  32. Xianping Ge. Pattern matching in financial time series data. final project report for ICS, 278, 1998. Google Scholar
  33. Rui Henriques. Learning from High-Dimensional Data using Local Descriptive Models. PhD thesis, Instituto Superior Tecnico, Universidade de Lisboa, Lisboa, 2016. Google Scholar
  34. Rui Henriques, Cláudia Antunes, and SaraC. Madeira. Methods for the efficient discovery of large item-indexable sequential patterns. In New Frontiers in Mining Complex Patterns, volume 8399 of LNCS, pages 100-116. Springer International Publishing, 2014. Google Scholar
  35. Rui Henriques and Sara C Madeira. Bicspam: flexible biclustering using sequential patterns. BMC bioinformatics, 15(1):130, 2014. Google Scholar
  36. Rui Henriques and Ana Paiva. Seven principles to mine flexible behavior from physiological signals for effective emotion recognition and description in affective interactions. In PhyCS, pages 75-82, 2014. Google Scholar
  37. Jan Holub, William F. Smyth, and Shu Wang. Fast pattern-matching on indeterminate strings. J. Discrete Algorithms, 6(1):37-50, 2008. URL: http://dx.doi.org/10.1016/j.jda.2006.10.003.
  38. Shuichi Kawashima and Minoru Kanehisa. Aaindex: amino acid index database. Nucleic acids research, 28(1):374-374, 2000. Google Scholar
  39. Jinil Kim, Peter Eades, Rudolf Fleischer, Seok-Hee Hong, Costas S Iliopoulos, Kunsoo Park, Simon J Puglisi, and Takeshi Tokuyama. Order-preserving matching. Theoretical Computer Science, 525:68-79, 2014. Google Scholar
  40. Donald E Knuth, James H Morris, Jr, and Vaughan R Pratt. Fast pattern matching in strings. SIAM journal on computing, 6(2):323-350, 1977. Google Scholar
  41. Marcin Kubica, Tomasz Kulczyński, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. A linear time algorithm for consecutive permutation pattern matching. Information Processing Letters, 113(12):430-433, 2013. Google Scholar
  42. Inbok Lee, Raphaël Clifford, and Sung-Ryul Kim. Algorithms on extended (δ, γ)-matching. Computational Science and Its Applications-ICCSA 2006, pages 1137-1142, 2006. Google Scholar
  43. Inbok Lee, Juan Mendivelso, and Yoan J Pinzón. δγ-parameterized matching. In International Symposium on String Processing and Information Retrieval, pages 236-248. Springer, 2008. Google Scholar
  44. Ohad Lipsky and Ely Porat. Approximate matching in the l_infinity metric. Inf. Process. Lett., 105(4):138-140, 2008. URL: http://dx.doi.org/10.1016/j.ipl.2007.08.012.
  45. Juan Mendivelso, Inbok Lee, and Yoan J Pinzón. Approximate function matching under δ-and γ-distances. In SPIRE, pages 348-359. Springer, 2012. Google Scholar
  46. S Muthukrishnan. New results and open problems related to non-standard stringology. In Combinatorial Pattern Matching, pages 298-317. Springer, 1995. Google Scholar
  47. Ely Porat and Klim Efremenko. Approximating general metric distances between a pattern and a text. In ACM-SIAM symposium on Discrete algorithms, pages 419-427. SIAM, 2008. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail