Document

Order-Preserving Pattern Matching Indeterminate Strings

File

LIPIcs.CPM.2018.2.pdf
• Filesize: 0.56 MB
• 15 pages

Cite As

Rui Henriques, Alexandre P. Francisco, Luís M. S. Russo, and Hideo Bannai. Order-Preserving Pattern Matching Indeterminate Strings. In 29th Annual Symposium on Combinatorial Pattern Matching (CPM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 105, pp. 2:1-2:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.CPM.2018.2

Abstract

Given an indeterminate string pattern p and an indeterminate string text t, the problem of order-preserving pattern matching with character uncertainties (muOPPM) is to find all substrings of t that satisfy one of the possible orderings defined by p. When the text and pattern are determinate strings, we are in the presence of the well-studied exact order-preserving pattern matching (OPPM) problem with diverse applications on time series analysis. Despite its relevance, the exact OPPM problem suffers from two major drawbacks: 1) the inability to deal with indetermination in the text, thus preventing the analysis of noisy time series; and 2) the inability to deal with indetermination in the pattern, thus imposing the strict satisfaction of the orders among all pattern positions. In this paper, we provide the first polynomial algorithms to answer the muOPPM problem when: 1) indetermination is observed on the pattern or text; and 2) indetermination is observed on both the pattern and the text and given by uncertainties between pairs of characters. First, given two strings with the same length m and O(r) uncertain characters per string position, we show that the muOPPM problem can be solved in O(mr lg r) time when one string is indeterminate and r in N^+ and in O(m^2) time when both strings are indeterminate and r=2. Second, given an indeterminate text string of length n, we show that muOPPM can be efficiently solved in polynomial time and linear space.

Subject Classification

ACM Subject Classification
• Theory of computation → Pattern matching
Keywords
• Order-preserving pattern matching
• Indeterminate string analysis
• Generic pattern matching
• Satisfiability

Metrics

• Access Statistics
• Total Accesses (updated on a weekly basis)
0

References

1. Amihood Amir, Yonatan Aumann, Piotr Indyk, Avivit Levy, and Ely Porat. Efficient computations of l_1 and l_infinity rearrangement distances. Theor. Comput. Sci., 410(43):4382-4390, 2009. URL: http://dx.doi.org/10.1016/j.tcs.2009.07.019.
2. Amihood Amir, Yonatan Aumann, Gad M. Landau, Moshe Lewenstein, and Noa Lewenstein. Pattern matching with swaps. J. Algorithms, 37(2):247-266, 2000. URL: http://dx.doi.org/10.1006/jagm.2000.1120.
3. Amihood Amir, Yonatan Aumann, Moshe Lewenstein, and Ely Porat. Function matching. SIAM Journal on Computing, 35(5):1007-1022, 2006.
4. Amihood Amir, Richard Cole, Ramesh Hariharan, Moshe Lewenstein, and Ely Porat. Overlap matching. Inf. Comput., 181(1):57-74, 2003. URL: http://dx.doi.org/10.1016/S0890-5401(02)00035-4.
5. Amihood Amir and Martin Farach. Efficient 2-dimensional approximate matching of half-rectangular figures. Inf. Comput., 118(1):1-11, 1995. URL: http://dx.doi.org/10.1006/inco.1995.1047.
6. Amihood Amir, Martin Farach, and S. Muthukrishnan. Alphabet dependence in parameterized matching. Inf. Process. Lett., 49(3):111-115, 1994. URL: http://dx.doi.org/10.1016/0020-0190(94)90086-8.
7. Amihood Amir, Ohad Lipsky, Ely Porat, and Julia Umanski. Approximate matching in the l1 metric. In CPM, volume 5, pages 91-103. Springer, 2005.
8. Amihood Amir and Igor Nor. Generalized function matching. J. Discrete Algorithms, 5(3):514-523, 2007. URL: http://dx.doi.org/10.1016/j.jda.2006.10.001.
9. Alberto Apostolico. General pattern matching. In Mikhail J. Atallah and Marina Blanton, editors, Algorithms and Theory of Computation Handbook, pages 15-15. Chapman &Hall/CRC, 2010.
10. Bengt Aspvall, Michael F Plass, and Robert Endre Tarjan. A linear-time algorithm for testing the truth of certain quantified boolean formulas. Information Processing Letters, 8(3):121-123, 1979.
11. Brenda S Baker. A theory of parameterized pattern matching: algorithms and applications. In ACM symposium on Theory of computing, pages 71-80. ACM, 1993.
12. Djamal Belazzougui, A. Pierrot, M. Raffinot, and Stéphane Vialette. Single and multiple consecutive permutation motif search. In Int. Symposium on Algorithms and Computation, pages 66-77. Springer, 2013.
13. Robert S Boyer and J Strother Moore. A fast string searching algorithm. Communications of the ACM, 20(10):762-772, 1977.
14. Emilios Cambouropoulos, M. Crochemore, C. Iliopoulos, L. Mouchard, and Yoan Pinzon. Algorithms for computing approximate repetitions in musical sequences. Int. Journal of Computer Mathematics, 79(11):1135-1148, 2002.
15. Domenico Cantone, Salvatore Cristofaro, and Simone Faro. An efficient algorithm for δ-approximate matching with α-bounded gaps in musical sequences. In IW on Experimental and Efficient Algorithms, pages 428-439. Springer, 2005.
16. Domenico Cantone, Salvatore Cristofaro, and Simone Faro. On tuning the (δ, α)-sequential-sampling algorithm for δ-approximate matching with alpha-bounded gaps in musical sequences. In ISMIR, pages 454-459, 2005.
17. Domenico Cantone, Simone Faro, and M Oguzhan Külekci. An efficient skip-search approach to the order-preserving pattern matching problem. In Stringology, pages 22-35, 2015.
18. Tamanna Chhabra, Simone Faro, M. Oguzhan Külekci, and Jorma Tarhio. Engineering order-preserving pattern matching with SIMD parallelism. Softw., Pract. Exper., 47(5):731-739, 2017. URL: http://dx.doi.org/10.1002/spe.2433.
19. Tamanna Chhabra, M Oguzhan Külekci, and Jorma Tarhio. Alternative algorithms for order-preserving matching. In Stringology, pages 36-46, 2015.
20. Tamanna Chhabra and Jorma Tarhio. A filtration method for order-preserving matching. Inf. Process. Lett., 116(2):71-74, 2016. URL: http://dx.doi.org/10.1016/j.ipl.2015.10.005.
21. Sukhyeun Cho, Joong Chae Na, Kunsoo Park, and Jeong Seop Sim. Fast order-preserving pattern matching. In Combinatorial Optimization and Applications, pages 295-305. Springer, 2013.
22. Sukhyeun Cho, Joong Chae Na, Kunsoo Park, and Jeong Seop Sim. A fast algorithm for order-preserving pattern matching. Information Processing Letters, 115(2):397-402, 2015.
23. Peter Clifford, Raphaël Clifford, and Costas Iliopoulos. Faster algorithms for δ, γ-matching and related problems. In Annual Symposium on Combinatorial Pattern Matching, pages 68-78. Springer, 2005.
24. Raphaël Clifford and C Iliopoulos. Approximate string matching for music analysis. Soft Computing-A Fusion of Foundations, Methodologies and Applications, 8(9):597-603, 2004.
25. Richard Cole, C. Iliopoulos, T. Lecroq, W. Plandowski, and Wojciech Rytter. On special families of morphisms related to δ-matching and don't care symbols. Information Processing Letters, 85(5):227-233, 2003.
26. Maxime Crochemore, Costas S Iliopoulos, Thierry Lecroq, Wojciech Plandowski, and Wojciech Rytter. Three heuristics for delta-matching: delta-bm algorithms. In CPM, pages 178-189. Springer, 2002.
27. Erik D Demaine, Alejandro López-Ortiz, and J Ian Munro. Adaptive set intersections, unions, and differences. In In Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA. Citeseer, 2000.
28. Branislav Ďurian, Jan Holub, Hannu Peltola, and Jorma Tarhio. Improving practical exact string matching. Information Processing Letters, 110(4):148-152, 2010.
29. Michael L. Fredman. On computing the length of longest increasing subsequences. Discrete Mathematics, 11(1):29-35, 1975. URL: http://dx.doi.org/10.1016/0012-365X(75)90103-X.
30. Kimmo Fredriksson and Szymon Grabowski. Practical and optimal string matching. In SPIRE, volume 3772, pages 376-387. Springer, 2005.
31. Kimmo Fredriksson and Szymon Grabowski. Efficient algorithms for pattern matching with general gaps, character classes, and transposition invariance. Information Retrieval, 11(4):335-357, 2008.
32. Xianping Ge. Pattern matching in financial time series data. final project report for ICS, 278, 1998.
33. Rui Henriques. Learning from High-Dimensional Data using Local Descriptive Models. PhD thesis, Instituto Superior Tecnico, Universidade de Lisboa, Lisboa, 2016.
34. Rui Henriques, Cláudia Antunes, and SaraC. Madeira. Methods for the efficient discovery of large item-indexable sequential patterns. In New Frontiers in Mining Complex Patterns, volume 8399 of LNCS, pages 100-116. Springer International Publishing, 2014.
35. Rui Henriques and Sara C Madeira. Bicspam: flexible biclustering using sequential patterns. BMC bioinformatics, 15(1):130, 2014.
36. Rui Henriques and Ana Paiva. Seven principles to mine flexible behavior from physiological signals for effective emotion recognition and description in affective interactions. In PhyCS, pages 75-82, 2014.
37. Jan Holub, William F. Smyth, and Shu Wang. Fast pattern-matching on indeterminate strings. J. Discrete Algorithms, 6(1):37-50, 2008. URL: http://dx.doi.org/10.1016/j.jda.2006.10.003.
38. Shuichi Kawashima and Minoru Kanehisa. Aaindex: amino acid index database. Nucleic acids research, 28(1):374-374, 2000.
39. Jinil Kim, Peter Eades, Rudolf Fleischer, Seok-Hee Hong, Costas S Iliopoulos, Kunsoo Park, Simon J Puglisi, and Takeshi Tokuyama. Order-preserving matching. Theoretical Computer Science, 525:68-79, 2014.
40. Donald E Knuth, James H Morris, Jr, and Vaughan R Pratt. Fast pattern matching in strings. SIAM journal on computing, 6(2):323-350, 1977.
41. Marcin Kubica, Tomasz Kulczyński, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. A linear time algorithm for consecutive permutation pattern matching. Information Processing Letters, 113(12):430-433, 2013.
42. Inbok Lee, Raphaël Clifford, and Sung-Ryul Kim. Algorithms on extended (δ, γ)-matching. Computational Science and Its Applications-ICCSA 2006, pages 1137-1142, 2006.
43. Inbok Lee, Juan Mendivelso, and Yoan J Pinzón. δγ-parameterized matching. In International Symposium on String Processing and Information Retrieval, pages 236-248. Springer, 2008.
44. Ohad Lipsky and Ely Porat. Approximate matching in the l_infinity metric. Inf. Process. Lett., 105(4):138-140, 2008. URL: http://dx.doi.org/10.1016/j.ipl.2007.08.012.
45. Juan Mendivelso, Inbok Lee, and Yoan J Pinzón. Approximate function matching under δ-and γ-distances. In SPIRE, pages 348-359. Springer, 2012.
46. S Muthukrishnan. New results and open problems related to non-standard stringology. In Combinatorial Pattern Matching, pages 298-317. Springer, 1995.
47. Ely Porat and Klim Efremenko. Approximating general metric distances between a pattern and a text. In ACM-SIAM symposium on Discrete algorithms, pages 419-427. SIAM, 2008.
X

Feedback for Dagstuhl Publishing