Longest Palindromic Substring in Sublinear Time

Authors Panagiotis Charalampopoulos , Solon P. Pissis , Jakub Radoszewski



PDF
Thumbnail PDF

File

LIPIcs.CPM.2022.20.pdf
  • Filesize: 0.8 MB
  • 9 pages

Document Identifiers

Author Details

Panagiotis Charalampopoulos
  • Reichman University, Herzliya, Israel
Solon P. Pissis
  • CWI, Amsterdam, The Netherlands
  • Vrije Universiteit, Amsterdam, The Netherlands
Jakub Radoszewski
  • Institute of Informatics, University of Warsaw, Poland

Cite As Get BibTex

Panagiotis Charalampopoulos, Solon P. Pissis, and Jakub Radoszewski. Longest Palindromic Substring in Sublinear Time. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 20:1-20:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022) https://doi.org/10.4230/LIPIcs.CPM.2022.20

Abstract

We revisit the classic algorithmic problem of computing a longest palidromic substring. This problem is solvable by a celebrated 𝒪(n)-time algorithm [Manacher, J. ACM 1975], where n is the length of the input string. For small alphabets, 𝒪(n) is not necessarily optimal in the word RAM model of computation: a string of length n over alphabet [0,σ) can be stored in 𝒪(n log σ/log n) space and read in 𝒪(n log σ/log n) time. We devise a simple 𝒪(n log σ/log n)-time algorithm for computing a longest palindromic substring. In particular, our algorithm works in sublinear time if σ = 2^{o(log n)}. Our technique relies on periodicity and on the 𝒪(n log σ/log n)-time constructible data structure of Kempa and Kociumaka [STOC 2019] that answers longest common extension queries in 𝒪(1) time.

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • string algorithms
  • longest palindromic substring
  • longest common extension

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Amihood Amir and Itai Boneh. Dynamic palindrome detection. CoRR, abs/1906.09732, 2019. URL: http://arxiv.org/abs/1906.09732.
  2. Amihood Amir, Panagiotis Charalampopoulos, Solon P. Pissis, and Jakub Radoszewski. Dynamic and internal longest common substring. Algorithmica, 82(12):3707-3743, 2020. URL: https://doi.org/10.1007/s00453-020-00744-0.
  3. Alberto Apostolico, Dany Breslauer, and Zvi Galil. Parallel detection of all palindromes in a string. Theoretical Computer Science, 141(1):163-173, 1995. URL: https://doi.org/10.1016/0304-3975(94)00083-U.
  4. Ricardo A. Baeza-Yates. Improved string searching. Software: Practice and Experience, 19(3):257-271, 1989. URL: https://doi.org/10.1002/spe.4380190305.
  5. Hideo Bannai, Travis Gagie, Shunsuke Inenaga, Juha Kärkkäinen, Dominik Kempa, Marcin Piatkowski, and Shiho Sugimoto. Diverse palindromic factorization is NP-complete. International Journal of Foundations of Computer Science, 29(2):143-164, 2018. URL: https://doi.org/10.1142/S0129054118400014.
  6. Djamal Belazzougui. Worst case efficient single and multiple string matching in the RAM model. In Combinatorial Algorithms - 21st International Workshop, IWOCA 2010, volume 6460 of Lecture Notes in Computer Science, pages 90-102. Springer, 2010. URL: https://doi.org/10.1007/978-3-642-19222-7_10.
  7. Oren Ben-Kiki, Philip Bille, Dany Breslauer, Leszek Gasieniec, Roberto Grossi, and Oren Weimann. Optimal packed string matching. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2011, volume 13 of LIPIcs, pages 423-432. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2011. URL: https://doi.org/10.4230/LIPIcs.FSTTCS.2011.423.
  8. Oren Ben-Kiki, Philip Bille, Dany Breslauer, Leszek Gasieniec, Roberto Grossi, and Oren Weimann. Towards optimal packed string matching. Theoretical Computer Science, 525:111-129, 2014. URL: https://doi.org/10.1016/j.tcs.2013.06.013.
  9. Philip Bille. Fast searching in packed strings. In Combinatorial Pattern Matching, 20th Annual Symposium, CPM 2009, volume 5577 of Lecture Notes in Computer Science, pages 116-126. Springer, 2009. URL: https://doi.org/10.1007/978-3-642-02441-2_11.
  10. Philip Bille. Fast searching in packed strings. Journal of Discrete Algorithms, 9(1):49-56, 2011. URL: https://doi.org/10.1016/j.jda.2010.09.003.
  11. Philip Bille, Inge Li Gørtz, and Frederik Rye Skjoldjensen. Deterministic indexing for packed strings. In 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017, volume 78 of LIPIcs, pages 6:1-6:11. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017. URL: https://doi.org/10.4230/LIPIcs.CPM.2017.6.
  12. Kirill Borozdin, Dmitry Kosolobov, Mikhail Rubinchik, and Arseny M. Shur. Palindromic length in linear time. In 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017, volume 78 of LIPIcs, pages 23:1-23:12. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017. URL: https://doi.org/10.4230/LIPIcs.CPM.2017.23.
  13. Václav Brázda, Martin Bartas, Jiří Lỳsek, Jan Coufal, and Miroslav Fojta. Global analysis of inverted repeat sequences in human gene promoters reveals their non-random distribution and association with specific biological pathways. Genomics, 112(4):2772-2777, 2020. URL: https://doi.org/10.1016/j.ygeno.2020.03.014.
  14. Dany Breslauer, Leszek Gasieniec, and Roberto Grossi. Constant-time word-size string matching. In Combinatorial Pattern Matching - 23rd Annual Symposium, CPM 2012, pages 83-96, 2012. URL: https://doi.org/10.1007/978-3-642-31265-6_7.
  15. Panagiotis Charalampopoulos, Tomasz Kociumaka, Solon P. Pissis, and Jakub Radoszewski. Faster algorithms for longest common substring. In 29th Annual European Symposium on Algorithms, ESA 2021, volume 204 of LIPIcs, pages 30:1-30:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021. URL: https://doi.org/10.4230/LIPIcs.ESA.2021.30.
  16. Maxime Crochemore, Christophe Hancart, and Thierry Lecroq. Algorithms on strings. Cambridge University Press, 2007. Google Scholar
  17. Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Wojciech Tyczyński, and Tomasz Waleń. The maximum number of squares in a tree. In Combinatorial Pattern Matching - 23rd Annual Symposium, CPM 2012, pages 27-40, 2012. URL: https://doi.org/10.1007/978-3-642-31265-6_3.
  18. Michaela Čutová, Jacinta Manta, Otília Porubiaková, Patrik Kaura, Jiří Št'astnỳ, Eva B. Jagelská, Pratik Goswami, Martin Bartas, and Václav Brázda. Divergent distributions of inverted repeats and G-quadruplex forming sequences in Saccharomyces cerevisiae. Genomics, 112(2):1897-1901, 2020. URL: https://doi.org/10.1016/j.ygeno.2019.11.002.
  19. Gabriele Fici, Travis Gagie, Juha Kärkkäinen, and Dominik Kempa. A subquadratic algorithm for minimum palindromic factorization. Journal of Discrete Algorithms, 28:41-48, 2014. URL: https://doi.org/10.1016/j.jda.2014.08.001.
  20. Nathan J. Fine and Herbert S. Wilf. Uniqueness theorems for periodic functions. Proceedings of the American Mathematical Society, 16(1):109-114, 1965. URL: http://www.jstor.org/stable/2034009.
  21. Kimmo Fredriksson. Shift-or string matching with super-alphabets. Information Processing Letters, 87(4):201-204, 2003. URL: https://doi.org/10.1016/S0020-0190(03)00296-5.
  22. Mitsuru Funakoshi and Takuya Mieno. Minimal unique palindromic substrings after single-character substitution. In String Processing and Information Retrieval - 28th International Symposium, SPIRE 2021, pages 33-46, 2021. URL: https://doi.org/10.1007/978-3-030-86692-1_4.
  23. Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing longest palindromic substring after single-character or block-wise edits. Theoretical Computer Science, 859:116-133, 2021. URL: https://doi.org/10.1016/j.tcs.2021.01.014.
  24. François Le Gall and Saeed Seddighin. Quantum meets fine-grained complexity: Sublinear time quantum algorithms for string problems. In 13th Innovations in Theoretical Computer Science Conference, ITCS 2022, volume 215 of LIPIcs, pages 97:1-97:23. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022. URL: https://doi.org/10.4230/LIPIcs.ITCS.2022.97.
  25. Pawel Gawrychowski, Tomohiro I, Shunsuke Inenaga, Dominik Köppl, and Florin Manea. Tighter bounds and optimal algorithms for all maximal α-gapped repeats and palindromes - finding all maximal α-gapped repeats and palindromes in optimal worst case time on integer alphabets. Theory of Computing Systems, 62(1):162-191, 2018. URL: https://doi.org/10.1007/s00224-017-9794-5.
  26. Pawel Gawrychowski, Oleg Merkurev, Arseny M. Shur, and Przemyslaw Uznański. Tight tradeoffs for real-time approximation of longest palindromes in streams. Algorithmica, 81(9):3630-3654, 2019. URL: https://doi.org/10.1007/s00453-019-00591-8.
  27. Emanuele Giaquinta, Szymon Grabowski, and Kimmo Fredriksson. Approximate pattern matching with k-mismatches in packed text. Information Processing Letters, 113(19-21):693-697, 2013. URL: https://doi.org/10.1016/j.ipl.2013.07.002.
  28. Dan Gusfield. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology. Cambridge University Press, 1997. URL: https://doi.org/10.1017/cbo9780511574931.
  29. Tomohiro I, Shunsuke Inenaga, and Masayuki Takeda. Palindrome pattern matching. Theoretical Computer Science, 483:162-170, 2013. URL: https://doi.org/10.1016/j.tcs.2012.01.047.
  30. Tomohiro I, Shiho Sugimoto, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing palindromic factorizations and palindromic covers on-line. In Combinatorial Pattern Matching - 25th Annual Symposium, CPM 2014, volume 8486 of Lecture Notes in Computer Science, pages 150-161. Springer, 2014. URL: https://doi.org/10.1007/978-3-319-07566-2_16.
  31. Johan Jeuring. The derivation of on-line algorithms, with an application to finding palindromes. Algorithmica, 11(2):146-184, 1994. URL: https://doi.org/10.1007/BF01182773.
  32. Dominik Kempa and Tomasz Kociumaka. String synchronizing sets: sublinear-time BWT construction and optimal LCE data structure. In 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, pages 756-767. ACM, 2019. URL: https://doi.org/10.1145/3313276.3316368.
  33. Gad M. Landau and Uzi Vishkin. Efficient string matching with k mismatches. Theoretical Computer Science, 43:239-249, 1986. URL: https://doi.org/10.1016/0304-3975(86)90178-7.
  34. Glenn K. Manacher. A new linear-time "on-line" algorithm for finding the smallest initial palindrome of a string. Journal of the ACM, 22(3):346-351, 1975. URL: https://doi.org/10.1145/321892.321896.
  35. Fernando Martínez-Alberola, Eva Barreno, Leonardo M Casano, Francisco Gasulla, Arantzazu Molins, Patricia Moya, María González-Hourcade, and Eva M. Del Campo. The chloroplast genome of the lichen-symbiont microalga Trebouxia sp. Tr9 (Trebouxiophyceae, Chlorophyta) shows short inverted repeats with a single gene and loss of the rps4 gene, which is encoded by the nucleus. Journal of Phycology, 56(1):170-184, 2020. URL: https://doi.org/10.1111/jpy.12928.
  36. Wataru Matsubara, Shunsuke Inenaga, Akira Ishino, Ayumi Shinohara, Tomoyuki Nakamura, and Kazuo Hashimoto. Efficient algorithms to compute compressed longest common substrings and compressed palindromes. Theoretical Computer Science, 410(8-10):900-913, 2009. URL: https://doi.org/10.1016/j.tcs.2008.12.016.
  37. Gonzalo Navarro and Mathieu Raffinot. A bit-parallel approach to suffix automata: Fast extended string matching. In Combinatorial Pattern Matching, 9th Annual Symposium, CPM 1998, volume 1448 of Lecture Notes in Computer Science, pages 14-33. Springer, 1998. URL: https://doi.org/10.1007/BFb0030778.
  38. Christopher E. Pearson, Haralabos Zorbas, Gerald B. Price, and Maria Zannis-Hadjopoulos. Inverted repeats, stem-loops, and cruciforms: significance for initiation of DNA replication. Journal of cellular biochemistry, 63(1):1-22, 1996. URL: https://doi.org/10.1002/(SICI)1097-4644(199610)63:1<1::AID-JCB1>3.0.CO;2-3.
  39. Mikhail Rubinchik and Arseny M. Shur. Counting palindromes in substrings. In String Processing and Information Retrieval - 24th International Symposium, SPIRE 2017, volume 10508 of Lecture Notes in Computer Science, pages 290-303. Springer, 2017. URL: https://doi.org/10.1007/978-3-319-67428-5_25.
  40. Mikhail Rubinchik and Arseny M. Shur. EERTREE: an efficient data structure for processing palindromes in strings. European Journal of Combinatorics, 68:249-265, 2018. URL: https://doi.org/10.1016/j.ejc.2017.07.021.
  41. Mikhail Rubinchik and Arseny M. Shur. Palindromic k-factorization in pure linear time. In 45th International Symposium on Mathematical Foundations of Computer Science, MFCS 2020, volume 170 of LIPIcs, pages 81:1-81:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020. URL: https://doi.org/10.4230/LIPIcs.MFCS.2020.81.
  42. Xin Tao, Shaochun Yuan, Fan Chen, Xiaoman Gao, Xinli Wang, Wenjuan Yu, Song Liu, Ziwen Huang, Shangwu Chen, and Anlong Xu. Functional requirement of terminal inverted repeats for efficient ProtoRAG activity reveals the early evolution of V(D)J recombination. National Science Review, 7(2):403-417, 2020. URL: https://doi.org/10.1093/nsr/nwz179.
  43. Ran Zhou, David Macaya-Sanz, Craig H. Carlson, Jeremy Schmutz, Jerry W. Jenkins, David Kudrna, Aditi Sharma, Laura Sandor, Shengqiang Shu, Kerrie Barry, et al. A willow sex chromosome reveals convergent evolution of complex palindromic repeats. Genome Biology, 21(1):1-19, 2020. URL: https://doi.org/10.1186/s13059-020-1952-4.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail