Approximate Nearest Neighbor for Curves - Simple, Efficient, and Deterministic

Authors Arnold Filtser, Omrit Filtser, Matthew J. Katz




File

LIPIcs.ICALP.2020.48.pdf
  • Filesize: 0.57 MB
  • 19 pages


Author Details

Arnold Filtser
  • Department of Computer Science, Columbia University, New York, NY, USA
Omrit Filtser
  • Department of Applied Mathematics and Statistics, Stony Brook University, NY, USA
Matthew J. Katz
  • Department of Computer Science, Ben-Gurion University of the Negev, Beer Sheva, Israel

Acknowledgements

We wish to thank Boris Aronov for helpful discussions on the problems studied in this paper.

Cite As

Arnold Filtser, Omrit Filtser, and Matthew J. Katz. Approximate Nearest Neighbor for Curves - Simple, Efficient, and Deterministic. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 48:1-48:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.ICALP.2020.48

Abstract

In the (1+ε,r)-approximate near-neighbor problem for curves (ANNC) under some similarity measure δ, the goal is to construct a data structure for a given set 𝒞 of curves that supports approximate near-neighbor queries: Given a query curve Q, if there exists a curve C ∈ 𝒞 such that δ(Q,C)≤ r, then return a curve C' ∈ 𝒞 with δ(Q,C') ≤ (1+ε)r. There exists an efficient reduction from the (1+ε)-approximate nearest-neighbor problem to ANNC, where in the former problem the answer to a query is a curve C ∈ 𝒞 with δ(Q,C) ≤ (1+ε)⋅δ(Q,C^*), where C^* is the curve of 𝒞 most similar to Q. Given a set 𝒞 of n curves, each consisting of m points in d dimensions, we construct a data structure for ANNC that uses n⋅ O(1/ε)^{md} storage space and has O(md) query time (for a query curve of length m), where the similarity measure between two curves is their discrete Fréchet or dynamic time warping distance. Our method is simple to implement, deterministic, and results in an exponential improvement in both query time and storage space compared to all previous bounds. Further, we also consider the asymmetric version of ANNC, where the length of the query curves is k ≪ m, and obtain essentially the same storage and query bounds as above, except that m is replaced by k. Finally, we apply our method to a version of approximate range counting for curves and achieve similar bounds.
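To make the definitions in the abstract concrete, the following is a minimal Python sketch of the two similarity measures (discrete Fréchet and dynamic time warping, using their standard dynamic-programming recurrences) together with a brute-force query that satisfies the (1+ε,r) near-neighbor contract. It is not the data structure of the paper: the linear scan takes O(n⋅m²) time per query and only illustrates what a valid answer looks like. The function names (`discrete_frechet`, `dtw`, `near_neighbor_bruteforce`) and the choice of Euclidean distance between points are assumptions made for illustration.

```python
import math
import operator


def _align(P, Q, combine):
    """Shared DP over monotone alignments of two point sequences.

    combine(prev, d) folds the point distance d into the best value
    of a predecessor cell: max for discrete Frechet, + for DTW.
    """
    n, m = len(P), len(Q)
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])  # Euclidean distance (assumed)
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = combine(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = combine(dp[i - 1][0], d)
            else:
                best_prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = combine(best_prev, d)
    return dp[n - 1][m - 1]


def discrete_frechet(P, Q):
    """Discrete Frechet distance: minimize the largest matched distance."""
    return _align(P, Q, max)


def dtw(P, Q):
    """Dynamic time warping: minimize the sum of matched distances."""
    return _align(P, Q, operator.add)


def near_neighbor_bruteforce(curves, Q, r, eps, delta=discrete_frechet):
    """Linear-scan baseline for the (1+eps, r) near-neighbor contract.

    If some curve C has delta(Q, C) <= r, the scan is guaranteed to
    return a curve C' with delta(Q, C') <= (1 + eps) * r. If no curve
    is within (1 + eps) * r, it returns None.
    """
    for C in curves:
        if delta(Q, C) <= (1 + eps) * r:
            return C
    return None


if __name__ == "__main__":
    P = [(0, 0), (1, 0), (2, 0)]
    near = [(0, 1), (1, 1), (2, 1)]
    far = [(0, 5), (1, 5), (2, 5)]
    print(discrete_frechet(P, near))
    print(dtw(P, near))
    print(near_neighbor_bruteforce([far, near], P, r=1.0, eps=0.1))
```

Any curve within (1+ε)r is an acceptable answer, so the scan may stop at the first such curve; this slack is exactly what distinguishes the approximate near-neighbor problem from exact search.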

Subject Classification

ACM Subject Classification
  • Theory of computation → Computational geometry
  • Theory of computation → Design and analysis of algorithms
Keywords
  • polygonal curves
  • Fréchet distance
  • dynamic time warping
  • approximation algorithms
  • (asymmetric) approximate nearest neighbor
  • range counting
