Approximating Dynamic Time Warping Distance Between Run-Length Encoded Strings

Authors Zoe Xi, William Kuszmaul



PDF
Thumbnail PDF

File

LIPIcs.ESA.2022.90.pdf
  • Filesize: 0.72 MB
  • 19 pages

Document Identifiers

Author Details

Zoe Xi
  • Massachusetts Institute of Technology, Cambridge, MA, USA
William Kuszmaul
  • Massachusetts Institute of Technology, Cambridge, MA, USA

Acknowledgements

The authors would like to thank Charles E. Leiserson for his helpful feedback and suggestions.

Cite AsGet BibTex

Zoe Xi and William Kuszmaul. Approximating Dynamic Time Warping Distance Between Run-Length Encoded Strings. In 30th Annual European Symposium on Algorithms (ESA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 244, pp. 90:1-90:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.ESA.2022.90

Abstract

Dynamic Time Warping (DTW) is a widely used similarity measure for comparing strings that encode time series data, with applications to areas including bioinformatics, signature verification, and speech recognition. The standard dynamic-programming algorithm for DTW takes O(n²) time, and there are conditional lower bounds showing that no algorithm can do substantially better. In many applications, however, the strings x and y may contain long runs of repeated letters, meaning that they can be compressed using run-length encoding. A natural question is whether the DTW-distance between these compressed strings can be computed efficiently in terms of the lengths k and 𝓁 of the compressed strings. Recent work has shown how to achieve O(k𝓁² + 𝓁 k²) time, leaving open the question of whether a near-quadratic Õ(k𝓁)-time algorithm might exist. We show that, if a small approximation loss is permitted, then a near-quadratic time algorithm is indeed possible: our algorithm computes a (1 + ε)-approximation for DTW(x, y) in Õ(k𝓁 / ε³) time, where k and 𝓁 are the number of runs in x and y. Our algorithm allows for DTW to be computed over any metric space (Σ, δ) in which distances are O(log n)-bit integers. Surprisingly, the algorithm also works even if δ does not induce a metric space on Σ (e.g., δ need not satisfy the triangle inequality).

Subject Classification

ACM Subject Classification
  • Theory of computation → Approximation algorithms analysis
Keywords
  • Dynamic time warping distance
  • approximation algorithms
  • run-length encodings
  • computational geometry

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. John Aach and George M Church. Aligning gene expression time series with time warping algorithms. Bioinformatics, 17(6):495-508, 2001. Google Scholar
  2. Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. Tight hardness results for lcs and other sequence similarity measures. In 56th Annual Symposium on Foundations of Computer Science (FOCS), pages 59-78. IEEE, 2015. Google Scholar
  3. Pankaj K. Agarwal, Kyle Fox, Jiangwei Pan, and Rex Ying. Approximating dynamic time warping and edit distance for a pair of point sequences. In 32nd International Symposium on Computational Geometry, SoCG 2016, June 14-18, 2016, Boston, MA, USA, pages 6:1-6:16, 2016. URL: https://doi.org/10.4230/LIPIcs.SoCG.2016.6.
  4. Alexandr Andoni, Robert Krauthgamer, and Krzysztof Onak. Polylogarithmic approximation for edit distance and the asymmetric query complexity. In Proceedings of the 51st Annual Symposium on Foundations of Computer Science (FOCS), pages 377-386. IEEE, 2010. Google Scholar
  5. Alexandr Andoni and Negev Shekel Nosatzki. Edit distance in near-linear time: It’s a constant factor. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 990-1001. IEEE, 2020. Google Scholar
  6. Alexandr Andoni and Krzysztof Onak. Approximating edit distance in near-linear time. SIAM J. Comput., 41(6):1635-1648, 2012. Google Scholar
  7. Alberto Apostolico, Gad M Landau, and Steven Skiena. Matching for run-length encoded strings. IEEE, 1997. Google Scholar
  8. Ora Arbell, Gad M Landau, and Joseph SB Mitchell. Edit distance of run-length encoded strings. Information Processing Letters, 83(6):307-314, 2002. Google Scholar
  9. Ziv Bar-Yossef, T.S. Jayram, Robert Krauthgamer, and Ravi Kumar. Approximating edit distance efficiently. In Proceedings of 45th Annual Symposium on Foundations of Computer Science (FOCS), pages 550-559. IEEE, 2004. Google Scholar
  10. Joshua Brakensiek and Aviad Rubinstein. Constant-factor approximation of near-linear edit distance in near-linear time. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 685-698, 2020. Google Scholar
  11. Vladimir Braverman, Moses Charikar, William Kuszmaul, David Woodruff, and Lin Yang. The one-way communication complexity of dynamic time warping distance. Manuscript submitted for publication, 2018. Google Scholar
  12. Karl Bringmann and Marvin Künnemann. Quadratic conditional lower bounds for string problems and dynamic time warping. In 56th Annual Symposium on Foundations of Computer Science (FOCS), pages 79-97. IEEE, 2015. Google Scholar
  13. Horst Bunke and János Csirik. Edit distance of run-length coded strings. In Proceedings of the 1992 ACM/SIGAPP Symposium on Applied computing: Technological challenges of the 1990’s, pages 137-143, 1992. Google Scholar
  14. EG Caiani, A Porta, G Baselli, M Turiel, S Muzzupappa, F Pieruzzi, C Crema, A Malliani, and S Cerutti. Warped-average template technique to track on a cycle-by-cycle basis the cardiac filling phases on left ventricular volume. In Computers in Cardiology 1998, pages 73-76. IEEE, 1998. Google Scholar
  15. Diptarka Chakraborty, Debarati Das, Elazar Goldenberg, Michal Kouckỳ, and Michael Saks. Approximating edit distance within constant factor in truly sub-quadratic time. Journal of the ACM (JACM), 67(6):1-22, 2020. Google Scholar
  16. Diptarka Chakraborty, Elazar Goldenberg, and Michal Kouckỳ. Streaming algorithms for embedding and computing edit distance in the low distance regime. In Proceedings of the 48th Annual Symposium on Theory of Computing (STOC), pages 712-725. ACM, 2016. Google Scholar
  17. Moses Charikar, Ofir Geri, Michael P Kim, and William Kuszmaul. On estimating edit distance: Alignment, dimension reduction, and embeddings. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018. Google Scholar
  18. Kuan-Yu Chen and Kun-Mao Chao. A fully compressed algorithm for computing the edit distance of run-length encoded strings. Algorithmica, 65(2):354-370, 2013. Google Scholar
  19. Raphaël Clifford, Pawel Gawrychowski, Tomasz Kociumaka, Daniel P Martin, and Przemyslaw Uznanski. Rle edit distance in near optimal time. In 44th International Symposium on Mathematical Foundations of Computer Science (MFCS 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019. Google Scholar
  20. Alexander De Luca, Alina Hang, Frederik Brudy, Christian Lindner, and Heinrich Hussmann. Touch me once and i know it’s you!: implicit authentication based on touch screen patterns. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 987-996. ACM, 2012. Google Scholar
  21. Vincent Froese, Brijnesh J. Jain, Maciej Rymar, and Mathias Weller. Fast exact dynamic time warping on run-length encoded time series. CoRR, abs/1903.03003, 2019. URL: http://arxiv.org/abs/1903.03003.
  22. Omer Gold and Micha Sharir. Dynamic time warping and geometric edit distance: Breaking the quadratic barrier. In Ioannis Chatzigiannakis, Piotr Indyk, Fabian Kuhn, and Anca Muscholl, editors, 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017, July 10-14, 2017, Warsaw, Poland, volume 80 of LIPIcs, pages 25:1-25:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017. URL: https://doi.org/10.4230/LIPIcs.ICALP.2017.25.
  23. Guan Shieng Huang, Jia Jie Liu, and Yue Li Wang. Sequence alignment algorithms for run-length-encoded strings. In International Computing and Combinatorics Conference, pages 319-330. Springer, 2008. Google Scholar
  24. Michal Kouckỳ and Michael Saks. Constant factor approximations to edit distance on far input pairs in nearly linear time. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 699-712, 2020. Google Scholar
  25. William Kuszmaul. Dynamic time warping in strongly subquadratic time: Algorithms for the low-distance regime and approximate evaluation. In Christel Baier, Ioannis Chatzigiannakis, Paola Flocchini, and Stefano Leonardi, editors, 46th International Colloquium on Automata, Languages, and Programming, ICALP 2019, July 9-12, 2019, Patras, Greece, volume 132 of LIPIcs, pages 80:1-80:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019. URL: https://doi.org/10.4230/LIPIcs.ICALP.2019.80.
  26. William Kuszmaul. Binary dynamic time warping in linear time. arXiv preprint, 2021. URL: http://arxiv.org/abs/2101.01108.
  27. Gad M. Landau, Eugene W. Myers, and Jeanette P. Schmidt. Incremental string comparison. SIAM Journal on Computing, 27(2):557-582, 1998. Google Scholar
  28. T. Warren Liao. Clustering of time series data - A survey. Pattern Recognit., 38(11):1857-1874, 2005. URL: https://doi.org/10.1016/j.patcog.2005.01.025.
  29. Jia Jie Liu, Guan-Shieng Huang, Yue-Li Wang, and Richard CT Lee. Edit distance for a run-length-encoded string and an uncompressed string. Information Processing Letters, 105(1):12-16, 2007. Google Scholar
  30. Veli Mäkinen, Esko Ukkonen, and Gonzalo Navarro. Approximate matching of run-length compressed strings. Algorithmica, 35(4):347-369, 2003. Google Scholar
  31. William J Masek and Michael S Paterson. A faster algorithm computing string edit distances. Journal of Computer and System sciences, 20(1):18-31, 1980. Google Scholar
  32. J Mitchell. A geometric shortest path problem, with application to computing a longest common subsequence in run-length encoded strings. Tech-nical Report, Department of Applied Mathemat-ics, SUNY StonyBrook, NY, 1997. Google Scholar
  33. Lindasalwa Muda, Mumtaj Begam, and Irraivan Elamvazuthi. Voice recognition algorithms using mel frequency cepstral coefficient (MFCC) and dynamic time warping (DTW) techniques. arXiv preprint, 2010. URL: http://arxiv.org/abs/1003.4083.
  34. Mario E Munich and Pietro Perona. Continuous dynamic time warping for translation-invariant curve alignment with applications to signature verification. In Proceedings of 7th International Conference on Computer Vision, volume 1, pages 108-115, 1999. Google Scholar
  35. Saul B. Needleman and Christian D. Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of molecular biology, 48(3):443-453, 1970. Google Scholar
  36. Yoshifumi Sakai and Shunsuke Inenaga. A faster reduction of the dynamic time warping distance to the longest increasing subsequence length. arXiv preprint, 2020. URL: http://arxiv.org/abs/2005.09169.
  37. Yoshifumi Sakai and Shunsuke Inenaga. A reduction of the dynamic time warping distance to the longest increasing subsequence length. In 31st International Symposium on Algorithms and Computation (ISAAC 2020). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020. Google Scholar
  38. Nathan Schaar, Vincent Froese, and Rolf Niedermeier. Faster binary mean computation under dynamic time warping. arXiv preprint, 2020. URL: http://arxiv.org/abs/2002.01178.
  39. Anooshiravan Sharabiani, Houshang Darabi, Samuel Harford, Elnaz Douzali, Fazle Karim, Hereford Johnson, and Shun Chen. Asymptotic dynamic time warping calculation with utilizing value repetition. Knowl. Inf. Syst., 57(2):359-388, 2018. URL: https://doi.org/10.1007/s10115-018-1163-4.
  40. Taras K. Vintsyuk. Speech discrimination by dynamic programming. Cybernetics, 4(1):52-57, 1968. Google Scholar
  41. Zoe Xi and William Kuszmaul. Approximating dynamic time warping distance between run-length encoded strings. CoRR, abs/2207.00915, 2022. URL: http://arxiv.org/abs/2207.00915.
  42. Rex Ying, Jiangwei Pan, Kyle Fox, and Pankaj K Agarwal. A simple efficient approximation algorithm for dynamic time warping. In Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, page 21. ACM, 2016. Google Scholar
  43. Yunyue Zhu and Dennis Shasha. Warping indexes with envelope transforms for query by humming. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pages 181-192. ACM, 2003. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail