PalFM-Index: FM-Index for Palindrome Pattern Matching

Nagashita, Shinya; I, Tomohiro

doi:10.4230/LIPIcs.CPM.2023.23

Subject Classification

ACM Subject Classification

Theory of computation → Pattern matching

Keywords

Palindrome matching
Generalized string pattern matching
Indexing

Abstract

Palindrome pattern matching (pal-matching) is a kind of generalized pattern matching in which two strings x and y of the same length are considered to match (pal-match) if they have the same palindromic structure, i.e., for every 1 ≤ i < j ≤ |x| = |y|, x[i..j] is a palindrome if and only if y[i..j] is a palindrome. The pal-matching problem is to find, in a text, the occurrences of the substrings that pal-match a given pattern. Given a text T of length n over an alphabet of size σ, an index for pal-matching must support, for a pattern P of length m, counting queries that compute the number occ of occurrences of P and locating queries that compute the occurrences of P. In [I et al., Theor. Comput. Sci., 2013], the authors proposed an O(n lg n)-bit data structure that supports counting queries in O(m lg σ) time and locating queries in O(m lg σ + occ) time. In this paper, we propose an FM-index type index for the pal-matching problem, which we call the PalFM-index, that occupies 2n lg min(σ, lg n) + 2n + o(n) bits of space and supports counting queries in O(m) time. The PalFM-index can also support locating queries in O(m + Δ occ) time by adding (n/Δ) lg n + n + o(n) bits of space, where Δ is a parameter chosen from {1, 2, …, n} in the preprocessing phase.
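The definition of pal-matching can be made concrete with a brute-force sketch. The code below is only an illustration of the definition and of what counting and locating queries return; it is not the PalFM-index and makes no attempt to reach the bounds stated in the abstract. The function names (is_palindrome, pal_match, locate, count) are hypothetical and do not come from the paper, and positions are 0-indexed here, whereas the abstract uses 1-indexed positions.

```python
def is_palindrome(s: str) -> bool:
    """Return True if s reads the same forwards and backwards."""
    return s == s[::-1]


def pal_match(x: str, y: str) -> bool:
    """Check the pal-matching definition directly.

    x and y pal-match if they have equal length and, for every pair
    of positions i < j, x[i..j] is a palindrome exactly when y[i..j] is.
    """
    if len(x) != len(y):
        return False
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if is_palindrome(x[i:j + 1]) != is_palindrome(y[i:j + 1]):
                return False
    return True


def locate(text: str, pattern: str) -> list[int]:
    """Naive locating query: all 0-indexed start positions whose
    length-m window pal-matches the pattern."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if pal_match(text[i:i + m], pattern)
    ]


def count(text: str, pattern: str) -> int:
    """Naive counting query: the number occ of pal-matching windows."""
    return len(locate(text, pattern))


if __name__ == "__main__":
    print(pal_match("abab", "cdcd"))
    print(locate("abcbab", "xyx"))
    print(count("abcbab", "xyx"))
```

Note that the characters themselves need not agree: "abab" and "cdcd" pal-match because both contain palindromes exactly at the same positions, while "aab" and "abb" do not, since only the first has a palindrome in its first two positions.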

Cite As

Shinya Nagashita and Tomohiro I. PalFM-Index: FM-Index for Palindrome Pattern Matching. In 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 259, pp. 23:1-23:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023) https://doi.org/10.4230/LIPIcs.CPM.2023.23

Author Details

Shinya Nagashita

Kyushu Institute of Technology, Fukuoka, Japan

Tomohiro I

Kyushu Institute of Technology, Fukuoka, Japan

References

Jean-Paul Allouche, Michael Baake, Julien Cassaigne, and David Damanik. Palindrome complexity. Theor. Comput. Sci., 292(1):9-31, 2003.
Mira-Cristiana Anisiu, Valeriu Anisiu, and Zoltán Kása. Total palindrome complexity of finite words. Discrete Mathematics, 310(1):109-114, 2010. URL: https://doi.org/10.1016/j.disc.2009.08.002.
Kirill Borozdin, Dmitry Kosolobov, Mikhail Rubinchik, and Arseny M. Shur. Palindromic length in linear time. In Proc. 28th Annual Symposium on Combinatorial Pattern Matching (CPM) 2017, pages 23:1-23:12, 2017. URL: https://doi.org/10.4230/LIPIcs.CPM.2017.23.
Srecko Brlek, Sylvie Hamel, Maurice Nivat, and Christophe Reutenauer. On the palindromic complexity of infinite words. Int. J. Found. Comput. Sci., 15(2):293-306, 2004. URL: https://doi.org/10.1142/S012905410400242X.
Michael Burrows and David J. Wheeler. A block-sorting lossless data compression algorithm. Technical report, HP Labs, 1994.
Xavier Droubay, Jacques Justin, and Giuseppe Pirillo. Episturmian words and some constructions of de Luca and Rauzy. Theor. Comput. Sci., 255(1-2):539-553, 2001. URL: https://doi.org/10.1016/S0304-3975(99)00320-5.
Paolo Ferragina and Giovanni Manzini. Opportunistic data structures with applications. In FOCS, pages 390-398, 2000.
Paolo Ferragina, Giovanni Manzini, Veli Mäkinen, and Gonzalo Navarro. Compressed representations of sequences and full-text indexes. ACM Trans. Algorithms, 3(2), 2007.
Gabriele Fici, Travis Gagie, Juha Kärkkäinen, and Dominik Kempa. A subquadratic algorithm for minimum palindromic factorization. Journal of Discrete Algorithms, 28:41-48, 2014. StringMasters 2012 & 2013 Special Issue (Volume 1). URL: https://doi.org/10.1016/j.jda.2014.08.001.
Johannes Fischer and Volker Heun. Space-efficient preprocessing schemes for range minimum queries on static arrays. SIAM J. Comput., 40(2):465-492, 2011.
Travis Gagie, Giovanni Manzini, and Rossano Venturini. An encoding for order-preserving matching. In Proc. 25th Annual European Symposium on Algorithms (ESA) 2017, pages 38:1-38:15, 2017. URL: https://doi.org/10.4230/LIPIcs.ESA.2017.38.
Zvi Galil and Joel I. Seiferas. A linear-time on-line recognition algorithm for "palstar". J. ACM, 25(1):102-111, 1978. URL: https://doi.org/10.1145/322047.322056.
Arnab Ganguly, Rahul Shah, and Sharma V. Thankachan. pBWT: Achieving succinct data structures for parameterized pattern matching and related problems. In Proc. 28th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) 2017, pages 397-407, 2017. URL: https://doi.org/10.1137/1.9781611974782.25.
Arnab Ganguly, Rahul Shah, and Sharma V. Thankachan. Structural pattern matching - succinctly. In Proc. 28th International Symposium on Algorithms and Computation (ISAAC) 2017, pages 35:1-35:13, 2017. URL: https://doi.org/10.4230/LIPIcs.ISAAC.2017.35.
Amy Glen, Jacques Justin, Steve Widmer, and Luca Q. Zamboni. Palindromic richness. Eur. J. Comb., 30(2):510-531, 2009. URL: https://doi.org/10.1016/j.ejc.2008.04.006.
Alexander Golynski, Rajeev Raman, and S. Srinivasa Rao. On the redundancy of succinct data structures. In Joachim Gudmundsson, editor, Proc. 11th Scandinavian Workshop on Algorithm Theory (SWAT) 2008, volume 5124 of Lecture Notes in Computer Science, pages 148-159. Springer, 2008.
Roberto Grossi, Ankur Gupta, and Jeffrey Scott Vitter. High-order entropy-compressed text indexes. In Proc. 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) 2003, pages 841-850. ACM/SIAM, 2003.
Tomohiro I, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Counting and verifying maximal palindromes. In Proc. 17th International Symposium on String Processing and Information Retrieval (SPIRE) 2010, pages 135-146, 2010.
Tomohiro I, Shunsuke Inenaga, and Masayuki Takeda. Palindrome pattern matching. Theor. Comput. Sci., 483:162-170, 2013. URL: https://doi.org/10.1016/j.tcs.2012.01.047.
Tomohiro I, Shiho Sugimoto, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing palindromic factorizations and palindromic covers on-line. In Proc. 25th Annual Symposium on Combinatorial Pattern Matching (CPM) 2014, volume 8486 of Lecture Notes in Computer Science, pages 150-161. Springer, 2014.
Ignacio Tinoco Jr., Olke C. Uhlenbeck, and Mark D. Levine. Estimation of secondary structure in ribonucleic acids. Nature, 230:362-367, 1971.
Sung-Hwan Kim and Hwan-Gue Cho. A compact index for Cartesian tree matching. In Pawel Gawrychowski and Tatiana Starikovskaya, editors, Proc. 32nd Annual Symposium on Combinatorial Pattern Matching (CPM) 2021, volume 191 of LIPIcs, pages 18:1-18:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
Sung-Hwan Kim and Hwan-Gue Cho. Simpler FM-index for parameterized string matching. Inf. Process. Lett., 165:106026, 2021. URL: https://doi.org/10.1016/j.ipl.2020.106026.
Donald E. Knuth, James H. Morris, and Vaughan R. Pratt. Fast pattern matching in strings. SIAM J. Comput., 6(2):323-350, 1977.
Dmitry Kosolobov, Mikhail Rubinchik, and Arseny M. Shur. Pal^k is linear recognizable online. In SOFSEM 2015: Theory and Practice of Computer Science - 41st International Conference on Current Trends in Theory and Practice of Computer Science, Pec pod Sněžkou, Czech Republic, January 24-29, 2015. Proceedings, pages 289-301, 2015. URL: https://doi.org/10.1007/978-3-662-46078-8_24.
Glenn K. Manacher. A new linear-time "on-line" algorithm for finding the smallest initial palindrome of a string. J. ACM, 22(3):346-351, 1975. URL: https://doi.org/10.1145/321892.321896.
Yoshiaki Matsuoka, Takahiro Aoki, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Generalized pattern matching and periodicity under substring consistent equivalence relations. Theor. Comput. Sci., 656:225-233, 2016.
Antonio Restivo and Giovanna Rosone. Burrows-Wheeler transform and palindromic richness. Theor. Comput. Sci., 410(30-32):3018-3026, 2009. URL: https://doi.org/10.1016/j.tcs.2009.03.008.
Mikhail Rubinchik and Arseny M. Shur. EERTREE: an efficient data structure for processing palindromes in strings. Eur. J. Comb., 68:249-265, 2018. URL: https://doi.org/10.1016/j.ejc.2017.07.021.
