Locality-Sensitive Hashing of Curves

Authors Anne Driemel, Francesco Silvestri



PDF
Thumbnail PDF

File

LIPIcs.SoCG.2017.37.pdf
  • Filesize: 0.55 MB
  • 16 pages

Document Identifiers

Author Details

Anne Driemel
Francesco Silvestri

Cite AsGet BibTex

Anne Driemel and Francesco Silvestri. Locality-Sensitive Hashing of Curves. In 33rd International Symposium on Computational Geometry (SoCG 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 77, pp. 37:1-37:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)
https://doi.org/10.4230/LIPIcs.SoCG.2017.37

Abstract

We study data structures for storing a set of polygonal curves in R^d such that, given a query curve, we can efficiently retrieve similar curves from the set, where similarity is measured using the discrete Fréchet distance or the dynamic time warping distance. To this end we devise the first locality-sensitive hashing schemes for these distance measures. A major challenge is posed by the fact that these distance measures internally optimize the alignment between the curves. We give solutions for different types of alignments including constrained and unconstrained versions. For unconstrained alignments, we improve over a result by Indyk [SoCG 2002] for short curves. Let n be the number of input curves and let m be the maximum complexity of a curve in the input. In the particular case where m <= (a/(4d)) log n, for some fixed a>0, our solutions imply an approximate near-neighbor data structure for the discrete Fréchet distance that uses space in O(n^(1+a) log n) and achieves query time in O(n^a log^2 n) and constant approximation factor. Furthermore, our solutions provide a trade-off between approximation quality and computational performance: for any parameter k in [m], we can give a data structure that uses space in O(2^(2k) m^(k-1) n log n + nm), answers queries in O( 2^(2k) m^(k) log n) time and achieves approximation factor in O(m/k).
Keywords
  • Locality-Sensitive Hashing
  • Frechet distance
  • Dynamic Time Warping

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. S. Arya, D. Mount, A. Vigneron, and J. Xia. Space-time tradeoffs for proximity searching in doubling spaces. In Proc. 16th European Symp. Algorithms (ESA), pages 112-123, 2008. Google Scholar
  2. A. Backurs and A. Sidiropoulos. Constant-distortion embeddings of Hausdorff metrics into constant-dimensional l_p spaces. In Proc. 19th Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), volume 60, pages 1:1-1:15, 2016. Google Scholar
  3. Y. Bartal, L. A. Gottlieb, and O. Neiman. On the impossibility of dimension reduction for doubling subsets of 𝓁_p. In Proc. 13th Symp. on Computational Geometry (SOCG), pages 60:60-60:66, 2014. Google Scholar
  4. K. Bringmann. Why Walking the Dog Takes Time: Fréchet Distance Has No Strongly Subquadratic Algorithms Unless SETH Fails. In Proc. 55th Symp. on Foundations of Computer Science (FOCS), pages 661-670, 2014. Google Scholar
  5. J. C. Brown and P. J. O. Miller. Automatic classification of killer whale vocalizations using dynamic time warping. J. of the Acoustical Society of America, 122(2):1201-1207, 2007. Google Scholar
  6. J. Campbell, J. Tremblay, and C. Verbrugge. Clustering player paths. In Proc. 10th Int'l Conf. on the Foundations of Digital Games (FDG), 2015. Google Scholar
  7. M. de Berg, A. F. Cook, and J. Gudmundsson. Fast Fréchet queries. Comput. Geom., 46(6):747-755, 2013. Google Scholar
  8. A. Driemel and S. Har-Peled. Jaywalking your dog: Computing the Fréchet distance with shortcuts. SIAM J. Computing, 42(5):1830-1866, 2013. Google Scholar
  9. A. Driemel, A. Krivošija, and C. Sohler. Clustering time series under the Fréchet distance. In Proc. 27th Symp. on Discrete Algorithms (SODA), pages 766-785, 2016. Google Scholar
  10. A. Driemel and F. Silvesstri. Locality-sensitive hashing of curves. Arxiv:1703.04040, 2017. Google Scholar
  11. G. Forestier, F. Lalys, L. Riffaud, B. Trelhu, and P. Jannin. Classification of surgical processes using dynamic time warping. J. Biomedical Informatics, 45(2):255-264, 2012. Google Scholar
  12. J. Gudmundsson and N. Valladares. A GPU approach to subtrajectory clustering using the Fréchet distance. IEEE Trans. on Parallel and Distributed Systems, 26(4):924-937, 2015. Google Scholar
  13. Anupam Gupta, Robert Krauthgamer, and James R Lee. Bounded geometries, fractals, and low-distortion embeddings. In Proc. 44th Symp. Found. Comp. Science (FOCS), pages 534-543, 2003. Google Scholar
  14. S. Har-Peled, P. Indyk, and R. Motwani. Approximate nearest neighbor: Towards removing the curse of dimensionality. Theory of Computing, 8(1):321-350, 2012. Google Scholar
  15. B. Huang and W. Kinsner. ECG frame classification using dynamic time warping. In Proc. Canadian Conf. on Electrical and Computer Engineering, volume 2, pages 1105-1110, 2002. Google Scholar
  16. P. Indyk. On approximate nearest neighbors in non-euclidean spaces. In Proc. 39th Symp. on Foundations of Computer Science, pages 148-155, 1998. Google Scholar
  17. P. Indyk. Approximate nearest neighbor algorithms for Fréchet distance via product metrics. In Proc. 18th Symp. on Computational Geometry (SOCG), pages 102-106, 2002. Google Scholar
  18. P. Indyk and J. Matoušek. Low-distortion embeddings of finite metric spaces. In Handbook of Discrete and Computational Geometry, pages 177-196. CRC Press, 2004. Google Scholar
  19. P. Indyk and R. Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proc. 30th Symp. Theory of Computing (STOC), pages 604-613, 1998. Google Scholar
  20. R. J. Kenefic. Track clustering using Fréchet distance and minimum description length. J. of Aerospace Information Systems, 11(8):512-524, 2014. Google Scholar
  21. E. Keogh and C. A. Ratanamahatana. Exact indexing of dynamic time warping. Knowledge and Information Systems, 7(3):358-386, 2005. Google Scholar
  22. Z. M. Kovacs-Vajna. A fingerprint verification system based on triangular matching and dynamic time warping. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11):1266-1276, 2000. Google Scholar
  23. B. Legrand, C. S. Chang, S. H. Ong, S. Y. Neo, and N. Palanisamy. Chromosome classification using dynamic time warping. Pattern Recognition Letters, 29(3):215-222, 2008. Google Scholar
  24. Q. Lv, W. Josephson, Z. Wang, M. Charikar, and K. Li. Multi-probe lsh: Efficient indexing for high-dimensional similarity search. In Proc. 33rd Int'l Conf. on Very Large Data Bases, VLDB'07, pages 950-961. VLDB Endowment, 2007. Google Scholar
  25. J. Matoušek. Embedding finite metric spaces into euclidean spaces. In Lectures on Discrete Geometry, chapter 15. Springer, 2002. Google Scholar
  26. T. Rakthanmanon, B. Campana, A. Mueen, G. Batista, B. Westover, Q. Zhu, J. Zakaria, and E. Keogh. Searching and mining trillions of time series subsequences under dynamic time warping. In Proc. 18th Conf. Knowl. Disc. and Data Mining, pages 262-270, 2012. Google Scholar
  27. C. A. Ratanamahatana and E. Keogh. Three myths about dynamic time warping data mining. In Proc. SIAM Conf. on Data Mining (SDM), pages 506-510, 2005. Google Scholar
  28. G. Shakhnarovich, T. Darrell, and P. Indyk, editors. Nearest-Neighbor Methods in Learning and Vision: Theory and Practice. MIT Press, 2006. Google Scholar
  29. A. Shrivastava and P. Li. Asymmetric LSH (ALSH) for sublinear time maximum inner product search (MIPS). In Proc. 27th Conf. on Neural Information Processing Systems (NIPS), pages 2321-2329, 2014. Google Scholar
  30. G. K. D. Vries. Kernel methods for vessel trajectories. PhD thesis, Univ. Amsterdam, 2012. Google Scholar
  31. H. Zhu, J. Luo, H. Yin, X. Zhou, J. Z. Huang, and F. B. Zhan. Mining trajectory corridors using Fréchet distance and meshing grids. In Proc. 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), pages 228-237, 2010. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail