The Dynamic k-Mismatch Problem

Authors Raphaël Clifford , Paweł Gawrychowski , Tomasz Kociumaka , Daniel P. Martin , Przemysław Uznański



PDF
Thumbnail PDF

File

LIPIcs.CPM.2022.18.pdf
  • Filesize: 0.79 MB
  • 15 pages

Document Identifiers

Author Details

Raphaël Clifford
  • Department of Computer Science, University of Bristol, UK
Paweł Gawrychowski
  • Institute of Computer Science, University of Wrocław, Poland
Tomasz Kociumaka
  • University of California, Berkeley, CA, USA
Daniel P. Martin
  • The Alan Turing Institute, British Library, London, UK
Przemysław Uznański
  • Institute of Computer Science, University of Wrocław, Poland

Acknowledgements

We are grateful to Ely Porat and Shay Golan for insightful conversations about the dynamic k-mismatch problem at an early stage of this work.

Cite As Get BibTex

Raphaël Clifford, Paweł Gawrychowski, Tomasz Kociumaka, Daniel P. Martin, and Przemysław Uznański. The Dynamic k-Mismatch Problem. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 18:1-18:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022) https://doi.org/10.4230/LIPIcs.CPM.2022.18

Abstract

The text-to-pattern Hamming distances problem asks to compute the Hamming distances between a given pattern of length m and all length-m substrings of a given text of length n ≥ m. We focus on the well-studied k-mismatch version of the problem, where a distance needs to be returned only if it does not exceed a threshold k. Moreover, we assume n ≤ 2m (in general, one can partition the text into overlapping blocks). In this work, we develop data structures for the dynamic version of the k-mismatch problem supporting two operations: An update performs a single-letter substitution in the pattern or the text, whereas a query, given an index i, returns the Hamming distance between the pattern and the text substring starting at position i, or reports that the distance exceeds k. 
First, we describe a simple data structure with 𝒪̃(1) update time and 𝒪̃(k) query time. Through considerably more sophisticated techniques, we show that 𝒪̃(k) update time and 𝒪̃(1) query time is also achievable. These two solutions likely provide an essentially optimal trade-off for the dynamic k-mismatch problem with m^{Ω(1)} ≤ k ≤ √m: we prove that, in that case, conditioned on the 3SUM conjecture, one cannot simultaneously achieve k^{1-Ω(1)} time for all operations (updates and queries) after n^{𝒪(1)}-time initialization. For k ≥ √m, the same lower bound excludes achieving m^{1/2-Ω(1)} time per operation. This is known to be essentially tight for constant-sized alphabets: already Clifford et al. (STACS 2018) achieved 𝒪̃(√m) time per operation in that case, but their solution for large alphabets costs 𝒪̃(m^{3/4}) time per operation. We improve and extend the latter result by developing a trade-off algorithm that, given a parameter 1 ≤ x ≤ k, achieves update time 𝒪̃(m/k +√{mk/x}) and query time 𝒪̃(x). In particular, for k ≥ √m, an appropriate choice of x yields 𝒪̃(∛{mk}) time per operation, which is 𝒪̃(m^{2/3}) when only the trivial threshold k = m is provided.

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • Pattern matching
  • Hamming distance
  • dynamic algorithms

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Stephen Alstrup, Gerth Stølting Brodal, and Theis Rauhe. Pattern matching in dynamic texts. In SODA, pages 819-828, 2000. URL: http://dl.acm.org/citation.cfm?id=338219.338645.
  2. Amihood Amir and Itai Boneh. Update query time trade-off for dynamic suffix arrays. In Yixin Cao, Siu-Wing Cheng, and Minming Li, editors, 31st International Symposium on Algorithms and Computation, ISAAC 2020, volume 181 of LIPIcs, pages 63:1-63:16. Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020. URL: https://doi.org/10.4230/LIPIcs.ISAAC.2020.63.
  3. Amihood Amir and Itai Boneh. Dynamic suffix array with sub-linear update time and poly-logarithmic lookup time. CoRR, 2021. URL: http://arxiv.org/abs/2112.12678.
  4. Amihood Amir, Itai Boneh, Panagiotis Charalampopoulos, and Eitan Kondratovsky. Repetition detection in a dynamic string. In ESA, volume 144 of LIPIcs, pages 5:1-5:18, 2019. URL: https://doi.org/10.4230/LIPIcs.ESA.2019.5.
  5. Amihood Amir, Panagiotis Charalampopoulos, Solon P. Pissis, and Jakub Radoszewski. Longest common substring made fully dynamic. In ESA, volume 144 of LIPIcs, pages 6:1-6:17, 2019. URL: https://doi.org/10.4230/LIPIcs.ESA.2019.6.
  6. Amihood Amir, Moshe Lewenstein, and Ely Porat. Faster algorithms for string matching with k mismatches. Journal of Algorithms, 50(2):257-275, 2004. URL: https://doi.org/10.1016/S0196-6774(03)00097-X.
  7. Ilya Baran, Erik D. Demaine, and Mihai Patrascu. Subquadratic algorithms for 3sum. Algorithmica, 50(4):584-596, 2008. URL: https://doi.org/10.1007/s00453-007-9036-3.
  8. Timothy M. Chan. More logarithmic-factor speedups for 3SUM, (median, +)-convolution, and some geometric 3SUM-hard problems. ACM Transactions on Algorithms, 16(1):7:1-7:23, 2020. URL: https://doi.org/10.1145/3363541.
  9. Timothy M. Chan, Shay Golan, Tomasz Kociumaka, Tsvi Kopelowitz, and Ely Porat. Approximating text-to-pattern Hamming distances. In STOC, pages 643-656. ACM, 2020. URL: https://doi.org/10.1145/3357713.3384266.
  10. Panagiotis Charalampopoulos, Paweł Gawrychowski, and Karol Pokorski. Dynamic longest common substring in polylogarithmic time. In ICALP, volume 168 of LIPIcs, pages 27:1-27:19, 2020. URL: https://doi.org/10.4230/LIPIcs.ICALP.2020.27.
  11. Panagiotis Charalampopoulos, Tomasz Kociumaka, and Philip Wellnitz. Faster approximate pattern matching: A unified approach. In FOCS, pages 978-989, 2020. URL: https://doi.org/10.1109/FOCS46700.2020.00095.
  12. Raphaël Clifford, Allyx Fontaine, Ely Porat, Benjamin Sach, and Tatiana Starikovskaya. The k-mismatch problem revisited. In SODA, pages 2039-2052. SIAM, 2016. URL: https://doi.org/10.1137/1.9781611974331.ch142.
  13. Raphaël Clifford, Allan Grønlund, Kasper Green Larsen, and Tatiana Starikovskaya. Upper and lower bounds for dynamic data structures on strings. In STACS, volume 96 of LIPIcs, pages 22:1-22:14, 2018. URL: https://doi.org/10.4230/LIPIcs.STACS.2018.22.
  14. Raphaël Clifford, Tomasz Kociumaka, and Ely Porat. The streaming k-mismatch problem. In SODA, pages 1106-1125. SIAM, 2019. URL: https://doi.org/10.1137/1.9781611975482.68.
  15. Paweł Gawrychowski, Adam Karczmarz, Tomasz Kociumaka, Jakub Łącki, and Piotr Sankowski. Optimal dynamic strings. In SODA, pages 1509-1528. SIAM, 2018. URL: https://doi.org/10.1137/1.9781611975031.99.
  16. Paweł Gawrychowski and Przemysław Uznański. Towards unified approximate pattern matching for Hamming and L₁ distance. In ICALP, volume 107 of LIPIcs, pages 62:1-62:13, 2018. URL: https://doi.org/10.4230/LIPIcs.ICALP.2018.62.
  17. Shay Golan, Tomasz Kociumaka, Tsvi Kopelowitz, and Ely Porat. The streaming k-mismatch problem: Tradeoffs between space and total time. In CPM, volume 161 of LIPIcs, pages 15:1-15:15, 2020. URL: https://doi.org/10.4230/LIPIcs.CPM.2020.15.
  18. Monika Henzinger, Sebastian Krinninger, Danupon Nanongkai, and Thatchaphol Saranurak. Unifying and strengthening hardness for dynamic problems via the online matrix-vector multiplication conjecture. In STOC, pages 21-30. ACM, 2015. URL: https://doi.org/10.1145/2746539.2746609.
  19. Chloe Ching-Yun Hsu and Chris Umans. On multidimensional and monotone k-sum. In Kim G. Larsen, Hans L. Bodlaender, and Jean-François Raskin, editors, 42nd International Symposium on Mathematical Foundations of Computer Science, MFCS 2017, volume 83 of LIPIcs, pages 50:1-50:13. Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2017. URL: https://doi.org/10.4230/LIPIcs.MFCS.2017.50.
  20. Dominik Kempa and Tomasz Kociumaka. Dynamic suffix array with polylogarithmic queries and updates. CoRR, 2022. URL: http://arxiv.org/abs/2201.01285.
  21. Gad M. Landau and Uzi Vishkin. Efficient string matching with k mismatches. Theoretical Computer Science, 43:239-249, 1986. URL: https://doi.org/10.1016/0304-3975(86)90178-7.
  22. Andrea Lincoln, Virginia Vassilevska Williams, Joshua R. Wang, and R. Ryan Williams. Deterministic time-space trade-offs for k-sum. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, 43rd International Colloquium on Automata, Languages, and Programming, ICALP 2016, volume 55 of LIPIcs, pages 58:1-58:14. Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2016. URL: https://doi.org/10.4230/LIPIcs.ICALP.2016.58.
  23. Mihai Pătraşcu. Towards polynomial lower bounds for dynamic problems. In STOC, pages 603-610. ACM, 2010. URL: https://doi.org/10.1145/1806689.1806772.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail