Dynamic Time Warping in Strongly Subquadratic Time: Algorithms for the Low-Distance Regime and Approximate Evaluation

Author William Kuszmaul



PDF
Thumbnail PDF

File

LIPIcs.ICALP.2019.80.pdf
  • Filesize: 487 kB
  • 15 pages

Document Identifiers

Author Details

William Kuszmaul
  • Massachusetts Institute of Technology, Cambridge, USA

Acknowledgements

The author would like to thank Moses Charikar for his mentoring and advice throughout the project, Ofir Geri for his support and for many useful conversations, and Virginia Williams for suggesting the problem of reducing between edit distance and LCS.

Cite AsGet BibTex

William Kuszmaul. Dynamic Time Warping in Strongly Subquadratic Time: Algorithms for the Low-Distance Regime and Approximate Evaluation. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 80:1-80:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.ICALP.2019.80

Abstract

Dynamic time warping distance (DTW) is a widely used distance measure between time series, with applications in areas such as speech recognition and bioinformatics. The best known algorithms for computing DTW run in near quadratic time, and conditional lower bounds prohibit the existence of significantly faster algorithms. The lower bounds do not prevent a faster algorithm for the important special case in which the DTW is small, however. For an arbitrary metric space Sigma with distances normalized so that the smallest non-zero distance is one, we present an algorithm which computes dtw(x, y) for two strings x and y over Sigma in time O(n * dtw(x, y)). When dtw(x, y) is small, this represents a significant speedup over the standard quadratic-time algorithm. Using our low-distance regime algorithm as a building block, we also present an approximation algorithm which computes dtw(x, y) within a factor of O(n^epsilon) in time O~(n^{2 - epsilon}) for 0 < epsilon < 1. The algorithm allows for the strings x and y to be taken over an arbitrary well-separated tree metric with logarithmic depth and at most exponential aspect ratio. Notably, any polynomial-size metric space can be efficiently embedded into such a tree metric with logarithmic expected distortion. Extending our techniques further, we also obtain the first approximation algorithm for edit distance to work with characters taken from an arbitrary metric space, providing an n^epsilon-approximation in time O~(n^{2 - epsilon}), with high probability. Finally, we turn our attention to the relationship between edit distance and dynamic time warping distance. We prove a reduction from computing edit distance over an arbitrary metric space to computing DTW over the same metric space, except with an added null character (whose distance to a letter l is defined to be the edit-distance insertion cost of l). Applying our reduction to a conditional lower bound of Bringmann and Künnemann pertaining to edit distance over {0, 1}, we obtain a conditional lower bound for computing DTW over a three letter alphabet (with distances of zero and one). This improves on a previous result of Abboud, Backurs, and Williams, who gave a conditional lower bound for DTW over an alphabet of size five. With a similar approach, we also prove a reduction from computing edit distance (over generalized Hamming Space) to computing longest-common-subsequence length (LCS) over an alphabet with an added null character. Surprisingly, this means that one can recover conditional lower bounds for LCS directly from those for edit distance, which was not previously thought to be the case.

Subject Classification

ACM Subject Classification
  • Theory of computation
  • Theory of computation → Design and analysis of algorithms
Keywords
  • dynamic time warping
  • edit distance
  • approximation algorithm
  • tree metrics

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. John Aach and George M Church. Aligning gene expression time series with time warping algorithms. Bioinformatics, 17(6):495-508, 2001. Google Scholar
  2. Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. Quadratic-time hardness of LCS and other sequence similarity measures. arXiv preprint, 2015. URL: http://arxiv.org/abs/1501.07053.
  3. Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. Tight hardness results for LCS and other sequence similarity measures. In 56th Annual Symposium on Foundations of Computer Science (FOCS), pages 59-78, 2015. Google Scholar
  4. Pankaj K. Agarwal, Kyle Fox, Jiangwei Pan, and Rex Ying. Approximating Dynamic Time Warping and Edit Distance for a Pair of Point Sequences. In 32nd International Symposium on Computational Geometry (SoCG), pages 6:1-6:16, 2016. Google Scholar
  5. Alexandr Andoni, Robert Krauthgamer, and Krzysztof Onak. Polylogarithmic approximation for edit distance and the asymmetric query complexity. In Proceedings of the 51st Annual Symposium on Foundations of Computer Science (FOCS), pages 377-386, 2010. Google Scholar
  6. Alexandr Andoni and Krzysztof Onak. Approximating Edit Distance in Near-Linear Time. SIAM J. Comput., 41(6):1635-1648, 2012. Google Scholar
  7. Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). In Proceedings of the 47th Annual Symposium on Theory of Computing (STOC), pages 51-58, 2015. Google Scholar
  8. Nikhil Bansal, Niv Buchbinder, Aleksander Madry, and Joseph Naor. A polylogarithmic-competitive algorithm for the k-server problem. In 52nd Annual Symposium on Foundations of Computer Science (FOCS), pages 267-276, 2011. Google Scholar
  9. Ziv Bar-Yossef, T.S. Jayram, Robert Krauthgamer, and Ravi Kumar. Approximating edit distance efficiently. In Proceedings of 45th Annual Symposium on Foundations of Computer Science (FOCS), pages 550-559, 2004. Google Scholar
  10. Tugkan Batu, Funda Ergün, and Süleyman Cenk Sahinalp. Oblivious string embeddings and edit distance approximations. In Proceedings of the 17th Annual Symposium on Discrete Algorithms (SODA), pages 792-801, 2006. Google Scholar
  11. Nurjahan Begum, Liudmila Ulanova, Jun Wang, and Eamonn J. Keogh. Accelerating Dynamic Time Warping Clustering with a Novel Admissible Pruning Strategy. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 49-58, 2015. Google Scholar
  12. Vladimir Braverman, Moses Charikar, William Kuszmaul, David Woodruff, and Lin Yang. The One-Way Communication Complexity of Dynamic Time Warping Distance. Manuscript submitted for publication, 2018. Google Scholar
  13. Karl Bringmann and Marvin Künnemann. Quadratic conditional lower bounds for string problems and dynamic time warping. In 56th Annual Symposium on Foundations of Computer Science (FOCS), pages 79-97, 2015. Google Scholar
  14. Karl Bringmann and Wolfgang Mulzer. Approximability of the discrete Fréchet distance. Journal of Computational Geometry, 7(2):46-76, 2015. Google Scholar
  15. EG Caiani, A Porta, G Baselli, M Turiel, S Muzzupappa, F Pieruzzi, C Crema, A Malliani, and S Cerutti. Warped-average template technique to track on a cycle-by-cycle basis the cardiac filling phases on left ventricular volume. In Computers in Cardiology 1998, pages 73-76, 1998. Google Scholar
  16. Diptarka Chakraborty, Debarati Das, Elazar Goldenberg, Michal Kouckỳ, and Michael Saks. Approximating Edit Distance Within Constant Factor in Truly Sub-Quadratic Time. In Proceedings of the 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 979-990, 2018. Google Scholar
  17. Diptarka Chakraborty, Elazar Goldenberg, and Michal Kouckỳ. Streaming algorithms for computing edit distance without exploiting suffix trees. arXiv preprint, 2016. URL: http://arxiv.org/abs/1607.03718.
  18. Moses Charikar, Ofir Geri, Michael P. Kim, and William Kuszmaul. On Estimating Edit Distance: Alignment, Dimension Reduction, and Embeddings. In 45th International Colloquium on Automata, Languages, and Programming (ICALP), pages 34:1-34:14, 2018. URL: http://dx.doi.org/10.4230/LIPIcs.ICALP.2018.34.
  19. Alexander De Luca, Alina Hang, Frederik Brudy, Christian Lindner, and Heinrich Hussmann. Touch me once and i know it’s you!: implicit authentication based on touch screen patterns. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 987-996, 2012. Google Scholar
  20. Anne Driemel and Francesco Silvestri. Locality-Sensitive Hashing of Curves. In 33rd International Symposium on Computational Geometry (SoCG), pages 37:1-37:16, 2017. Google Scholar
  21. Ioannis Z. Emiris and Ioannis Psarros. Products of Euclidean Metrics and Applications to Proximity Questions among Curves. In 34th International Symposium on Computational Geometry, SoCG 2018, June 11-14, 2018, Budapest, Hungary, pages 37:1-37:13, 2018. Google Scholar
  22. Jittat Fakcharoenphol, Satish Rao, and Kunal Talwar. A tight bound on approximating arbitrary metrics by tree metrics. Journal of Computer and System Sciences, 69(3):485-497, 2004. Google Scholar
  23. Zvi Galil and Kunsoo Park. An improved algorithm for approximate string matching. SIAM Journal on Computing, 19(6):989-999, 1990. Google Scholar
  24. Toni Giorgino et al. Computing and visualizing dynamic time warping alignments in R: the DTW package. Journal of statistical Software, 31(7):1-24, 2009. Google Scholar
  25. Omer Gold and Micha Sharir. Dynamic Time Warping and Geometric Edit Distance: Breaking the Quadratic Barrier. In 44th International Colloquium on Automata, Languages, and Programming, (ICALP), pages 25:1-25:14, 2017. Google Scholar
  26. Tao Jiang, Guohui Lin, Bin Ma, and Kaizhong Zhang. A general edit distance between RNA structures. Journal of computational biology, 9(2):371-388, 2002. Google Scholar
  27. Eamonn J. Keogh. Exact Indexing of Dynamic Time Warping. In 28th International Conference on Very Large Data Bases (VLDB), pages 406-417, 2002. Google Scholar
  28. Eamonn J. Keogh and Michael J. Pazzani. Scaling up Dynamic Time Warping to Massive Dataset. In Principles of Data Mining and Knowledge Discovery, Third European Conference, (PKDD), pages 1-11, 1999. Google Scholar
  29. Eamonn J. Keogh and Michael J. Pazzani. Scaling up dynamic time warping for datamining applications. In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 285-289, 2000. Google Scholar
  30. Jon Kleinberg and Eva Tardos. Algorithm design. Pearson, 2006. Google Scholar
  31. William Kuszmaul. Dynamic Time Warping in Strongly Subquadratic time: Algorithms for the Low-Distance Regime and Approximate Evaluation. arXiv preprint, 2019. URL: http://arxiv.org/abs/1904.09690.
  32. William Kuszmaul. Efficiently Approximating Edit Distance Between Pseudorandom Strings. In Proceedings of the thirtieth annual ACM-SIAM Symposium on Discrete Algorithms. SIAM, 2019. Google Scholar
  33. Gad M. Landau, Eugene W. Myers, and Jeanette P. Schmidt. Incremental string comparison. SIAM Journal on Computing, 27(2):557-582, 1998. Google Scholar
  34. Lindasalwa Muda, Mumtaj Begam, and Irraivan Elamvazuthi. Voice recognition algorithms using mel frequency cepstral coefficient (MFCC) and dynamic time warping (DTW) techniques. arXiv preprint, 2010. URL: http://arxiv.org/abs/1003.4083.
  35. Mario E Munich and Pietro Perona. Continuous dynamic time warping for translation-invariant curve alignment with applications to signature verification. In Proceedings of 7th International Conference on Computer Vision, volume 1, pages 108-115, 1999. Google Scholar
  36. Gonzalo Navarro. A guided tour to approximate string matching. ACM computing surveys (CSUR), 33(1):31-88, 2001. Google Scholar
  37. Saul B. Needleman and Christian D. Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of molecular biology, 48(3):443-453, 1970. Google Scholar
  38. François Petitjean, Germain Forestier, Geoffrey I. Webb, Ann E. Nicholson, Yanping Chen, and Eamonn J. Keogh. Faster and more accurate classification of time series by exploiting a novel dynamic time warping averaging algorithm. Knowl. Inf. Syst., 47(1):1-26, 2016. Google Scholar
  39. Hiroaki Sakoe and Seibi Chiba. Dynamic programming algorithm optimization for spoken word recognition. IEEE transactions on acoustics, speech, and signal processing, 26(1):43-49, 1978. Google Scholar
  40. Esko Ukkonen. Algorithms for approximate string matching. Information and control, 64(1-3):100-118, 1985. Google Scholar
  41. Taras K. Vintsyuk. Speech discrimination by dynamic programming. Cybernetics, 4(1):52-57, 1968. Google Scholar
  42. Robert A. Wagner and Michael J. Fischer. The String-to-String Correction Problem. J. ACM, 21(1):168-173, 1974. Google Scholar
  43. Rex Ying, Jiangwei Pan, Kyle Fox, and Pankaj K Agarwal. A simple efficient approximation algorithm for dynamic time warping. In Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, page 21, 2016. Google Scholar
  44. Yunyue Zhu and Dennis Shasha. Warping indexes with envelope transforms for query by humming. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pages 181-192, 2003. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail