Dynamic Longest Common Substring in Polylogarithmic Time

Authors Panagiotis Charalampopoulos , Paweł Gawrychowski , Karol Pokorski




File

LIPIcs.ICALP.2020.27.pdf
  • Filesize: 0.66 MB
  • 19 pages

Author Details

Panagiotis Charalampopoulos
  • Department of Informatics, King’s College London, UK
  • Institute of Informatics, University of Warsaw, Poland
Paweł Gawrychowski
  • Institute of Computer Science, University of Wrocław, Poland
Karol Pokorski
  • Institute of Computer Science, University of Wrocław, Poland

Cite As

Panagiotis Charalampopoulos, Paweł Gawrychowski, and Karol Pokorski. Dynamic Longest Common Substring in Polylogarithmic Time. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 27:1-27:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.ICALP.2020.27

Abstract

The longest common substring problem consists in finding a longest string that appears as a (contiguous) substring of two input strings. We consider the dynamic variant of this problem, in which we are to maintain two dynamic strings S and T, each of length at most n, that undergo substitutions of letters, in order to be able to return a longest common substring after each substitution. Recently, Amir et al. [ESA 2019] presented a solution for this problem that needs only 𝒪̃(n^(2/3)) time per update. This brought the challenge of determining whether there exists a faster solution with polylogarithmic update time, or whether (as is the case for other dynamic problems) we should expect a polynomial (conditional) lower bound. We answer this question by designing a significantly faster algorithm that processes each substitution in amortized log^𝒪(1) n time with high probability. Our solution relies on exploiting the local consistency of the parsing of a collection of dynamic strings due to Gawrychowski et al. [SODA 2018], and on maintaining two dynamic trees with labeled bicolored leaves, so that after each update we can report a pair of nodes, one from each tree, of maximum combined weight, which have at least one common leaf-descendant of each color. We complement this with a lower bound of Ω(log n / log log n) for the update time of any polynomial-size data structure that maintains the LCS of two dynamic strings, even allowing amortization and randomization.
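To make the problem statement concrete, the following is a minimal sketch of the naive baseline, not the algorithm of the paper: a quadratic-time dynamic program for the static problem, wrapped in a hypothetical class `NaiveDynamicLCS` that recomputes the answer from scratch after every substitution. The function and class names are assumptions introduced here for illustration only. The point of the paper is to replace this full recomputation with polylogarithmic amortized work per update.

```python
def longest_common_substring(s: str, t: str) -> str:
    """Return one longest common (contiguous) substring of s and t.

    Classic dynamic program: cur[j] is the length of the longest common
    suffix of s[:i] and t[:j]. Runs in O(|s| * |t|) time, O(|t|) space.
    """
    best_len, best_end = 0, 0          # best match is s[best_end - best_len:best_end]
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return s[best_end - best_len:best_end]


class NaiveDynamicLCS:
    """Hypothetical baseline: supports letter substitutions in S and T,
    but answers each query by recomputing from scratch."""

    def __init__(self, s: str, t: str) -> None:
        self.s = list(s)
        self.t = list(t)

    def substitute(self, which: str, index: int, letter: str) -> None:
        """Replace the letter at position index of S (which='S') or T (which='T')."""
        target = self.s if which == "S" else self.t
        target[index] = letter

    def query(self) -> str:
        """Return a longest common substring of the current S and T."""
        return longest_common_substring("".join(self.s), "".join(self.t))


if __name__ == "__main__":
    d = NaiveDynamicLCS("abcde", "xbcdy")
    print(d.query())                   # longest common substring before the update
    d.substitute("T", 0, "a")          # T becomes "abcdy"
    print(d.query())                   # longest common substring after the update
```

Each query in this sketch costs time proportional to |S|·|T|, which is the cost the dynamic data structure described in the abstract is designed to avoid.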

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • string algorithms
  • dynamic algorithms
  • longest common substring

References

  1. Paniz Abedin, Sahar Hooshmand, Arnab Ganguly, and Sharma V. Thankachan. The heaviest induced ancestors problem revisited. In 29th CPM, pages 20:1-20:13, 2018. URL: https://doi.org/10.4230/LIPIcs.CPM.2018.20.
  2. Stephen Alstrup, Gerth Stølting Brodal, and Theis Rauhe. Pattern matching in dynamic texts. In 11th SODA, pages 819-828, 2000. URL: http://dl.acm.org/citation.cfm?id=338219.338645.
  3. Amihood Amir and Itai Boneh. Locally maximal common factors as a tool for efficient dynamic string algorithms. In 29th CPM, pages 11:1-11:13, 2018. URL: https://doi.org/10.4230/LIPIcs.CPM.2018.11.
  4. Amihood Amir and Itai Boneh. Dynamic palindrome detection. CoRR, abs/1906.09732, 2019. URL: http://arxiv.org/abs/1906.09732.
  5. Amihood Amir, Itai Boneh, Panagiotis Charalampopoulos, and Eitan Kondratovsky. Repetition Detection in a Dynamic String. In 27th ESA, pages 5:1-5:18, 2019. URL: https://doi.org/10.4230/LIPIcs.ESA.2019.5.
  6. Amihood Amir, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Solon P. Pissis, and Jakub Radoszewski. Longest common factor after one edit operation. In 24th SPIRE, pages 14-26, 2017. URL: https://doi.org/10.1007/978-3-319-67428-5_2.
  7. Amihood Amir, Panagiotis Charalampopoulos, Solon P. Pissis, and Jakub Radoszewski. Longest common substring made fully dynamic. In 27th ESA, pages 6:1-6:17, 2019. URL: https://doi.org/10.4230/LIPIcs.ESA.2019.6.
  8. Amihood Amir, Gad M. Landau, Moshe Lewenstein, and Dina Sokol. Dynamic text and static pattern matching. ACM Trans. Algorithms, 3(2):19, 2007. URL: https://doi.org/10.1145/1240233.1240242.
  9. Panagiotis Charalampopoulos, Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Linear-time algorithm for long LCF with k mismatches. In 29th CPM, pages 23:1-23:16, 2018. URL: https://doi.org/10.4230/LIPIcs.CPM.2018.23.
  10. Richard Cole, Lee-Ad Gottlieb, and Moshe Lewenstein. Dictionary matching and indexing with errors and don't cares. In 36th STOC, pages 91-100, 2004. URL: https://doi.org/10.1145/1007352.1007374.
  11. Martin Farach. Optimal suffix tree construction with large alphabets. In 38th FOCS, pages 137-143, 1997. URL: https://doi.org/10.1109/SFCS.1997.646102.
  12. Martin Farach and S. Muthukrishnan. Perfect hashing for strings: Formalization and algorithms. In 7th CPM, pages 130-140, 1996. URL: https://doi.org/10.1007/3-540-61258-0_11.
  13. Harold N. Gabow. Data structures for weighted matching and nearest common ancestors with linking. In 1st SODA, pages 434-443, 1990. URL: http://dl.acm.org/citation.cfm?id=320176.320229.
  14. Travis Gagie, Paweł Gawrychowski, and Yakov Nekrich. Heaviest induced ancestors and longest common substrings. In 25th CCCG, 2013. URL: http://cccg.ca/proceedings/2013/papers/paper_29.pdf.
  15. Paweł Gawrychowski, Adam Karczmarz, Tomasz Kociumaka, Jakub Łącki, and Piotr Sankowski. Optimal dynamic strings. In 29th SODA, pages 1509-1528, 2018. URL: https://doi.org/10.1137/1.9781611975031.99.
  16. Monika Henzinger, Sebastian Krinninger, Danupon Nanongkai, and Thatchaphol Saranurak. Unifying and strengthening hardness for dynamic problems via the online matrix-vector multiplication conjecture. In 47th STOC, pages 21-30, 2015. URL: https://doi.org/10.1145/2746539.2746609.
  17. Tomohiro I. Longest common extensions with recompression. In 28th CPM, pages 18:1-18:15, 2017. URL: https://doi.org/10.4230/LIPIcs.CPM.2017.18.
  18. Artur Jeż. Faster fully compressed pattern matching by recompression. ACM Transactions on Algorithms, 11(3):20:1-20:43, 2015. URL: https://doi.org/10.1145/2631920.
  19. Artur Jeż. Recompression: A simple and powerful technique for word equations. J. ACM, 63(1):4:1-4:51, 2016. URL: https://doi.org/10.1145/2743014.
  20. Tomasz Kociumaka, Jakub Radoszewski, and Tatiana A. Starikovskaya. Longest common substring with approximately k mismatches. Algorithmica, 81(6):2633-2652, 2019. URL: https://doi.org/10.1007/s00453-019-00548-x.
  21. Tomasz Kociumaka, Tatiana A. Starikovskaya, and Hjalte Wedel Vildhøj. Sublinear space algorithms for the longest common substring problem. In 22nd ESA, pages 605-617, 2014. URL: https://doi.org/10.1007/978-3-662-44777-2_50.
  22. Yakov Nekrich. A data structure for multi-dimensional range reporting. In 23rd SOCG, pages 344-353, 2007. URL: https://doi.org/10.1145/1247069.1247130.
  23. Mihai Pǎtraşcu. Unifying the landscape of cell-probe lower bounds. SIAM J. Comput., 40(3):827-847, 2011. URL: https://doi.org/10.1137/09075336X.
  24. Mihai Pǎtraşcu and Erik D. Demaine. Logarithmic lower bounds in the cell-probe model. SIAM J. Comput., 35(4):932-963, 2006. URL: https://doi.org/10.1137/S0097539705447256.
  25. Mihai Pǎtraşcu and Mikkel Thorup. Time-space trade-offs for predecessor search. In 38th STOC, pages 232-240, 2006. URL: https://doi.org/10.1145/1132516.1132551.
  26. S. Sahinalp and U. Vishkin. Efficient approximate and dynamic matching of patterns using a labeling paradigm. In 37th FOCS, pages 320-328, 1996. URL: https://doi.org/10.1109/SFCS.1996.548491.
  27. Tatiana A. Starikovskaya and Hjalte Wedel Vildhøj. Time-space trade-offs for the longest common substring problem. In 24th CPM, pages 223-234, 2013. URL: https://doi.org/10.1007/978-3-642-38905-4_22.
  28. Sharma V. Thankachan, Alberto Apostolico, and Srinivas Aluru. A provably efficient algorithm for the k-mismatch average common substring problem. Journal of Computational Biology, 23(6):472-482, 2016. URL: https://doi.org/10.1089/cmb.2015.0235.
  29. Peter Weiner. Linear pattern matching algorithms. In 14th FOCS, pages 1-11, 1973. URL: https://doi.org/10.1109/SWAT.1973.13.
  30. Dan E. Willard. Log-logarithmic worst-case range queries are possible in space Θ(n). Information Processing Letters, 17(2):81-84, 1983. URL: https://doi.org/10.1016/0020-0190(83)90075-3.
  31. Dan E. Willard and George S. Lueker. Adding range restriction capability to dynamic data structures. J. ACM, 32(3):597-617, 1985. URL: https://doi.org/10.1145/3828.3839.