Locality Sensitive Hashing for Efficient Similar Polygon Retrieval
Locality Sensitive Hashing (LSH) is an effective method of indexing a set of items to support efficient nearest neighbors queries in high-dimensional spaces. The basic idea of LSH is that similar items should produce hash collisions with higher probability than dissimilar items.
We study LSH for (not necessarily convex) polygons, and use it to give efficient data structures for similar shape retrieval. Arkin et al. [Arkin et al., 1991] represent polygons by their "turning function" - a function which follows the angle between the polygon’s tangent and the x-axis while traversing the perimeter of the polygon. They define the distance between polygons to be variations of the L_p (for p = 1,2) distance between their turning functions. This metric is invariant under translation, rotation and scaling (and the selection of the initial point on the perimeter) and therefore models well the intuitive notion of shape resemblance.
We develop and analyze LSH near neighbor data structures for several variations of the L_p distance for functions (for p = 1,2). By applying our schemes to the turning functions of a collection of polygons we obtain efficient near neighbor LSH-based structures for polygons. To tune our structures to turning functions of polygons, we prove some new properties of these turning functions that may be of independent interest.
As part of our analysis, we address the following problem which is of independent interest. Find the vertical translation of a function f that is closest in L₁ distance to a function g. We prove tight bounds on the approximation guarantee obtained by the translation which is equal to the difference between the averages of g and f.
Locality sensitive hashing
polygons
turning function
L_p distance
nearest neighbors
similarity search
Theory of computation~Data structures design and analysis
Theory of computation~Computational geometry
Information systems~Information retrieval
46:1-46:16
Regular Paper
This work was supported by ISF grant no. 1595/19, GIF grant no. 1367/2017 and the Blavatnik Foundation.
https://arxiv.org/abs/2101.04339
Haim
Kaplan
Haim Kaplan
School of Computer Science, Tel Aviv University, Israel
Jay
Tenenbaum
Jay Tenenbaum
School of Computer Science, Tel Aviv University, Israel
10.4230/LIPIcs.STACS.2021.46
Alexandr Andoni and Ilya Razenshteyn. Optimal data-dependent hashing for approximate near neighbors. In STOC, pages 793-801. ACM, 2015.
Esther M Arkin, L Paul Chew, Daniel P Huttenlocher, Klara Kedem, and Joseph S Mitchell. An efficiently computable metric for comparing polygonal shapes. Technical report, Cornell University, 1991.
Maria Astefanoaei, Paul Cesaretti, Panagiota Katsikouli, Mayank Goswami, and Rik Sarkar. Multi-resolution sketches and locality sensitive hashing for fast trajectory processing. In SIGSPATIAL, pages 279-288. ACM, 2018.
Artem Babenko, Anton Slesarev, Alexandr Chigorin, and Victor Lempitsky. Neural codes for image retrieval. In ECCV, pages 584-599. Springer, 2014.
Dana H Ballard. Generalizing the hough transform to detect arbitrary shapes. Pattern Recognition, 13(2):111-122, 1981.
Ilaria Bartolini, Paolo Ciaccia, and Marco Patella. Using the time warping distance for fourier-based shape retrieval. Technical report, IEIIT-BO-03-02, 2002.
Dusan Cakmakov and Emilija Celakoska. Estimation of curve similarity using turning functions. International Journal of Applied Mathematics, 15:403-416, 2004.
Matteo Ceccarello, Anne Driemel, and Francesco Silvestri. Fresh: Fréchet similarity with hashing. In WADS, pages 254-268. Springer, 2019.
Edgar Chávez, Ana C Chávez Cáliz, and Jorge L López-López. Affine invariants of generalized polygons and matching under affine transformations. Computational Geometry, 58:60-69, 2016.
Anne Driemel and Francesco Silvestri. Locality-sensitive hashing of curves. In SOCG, pages 37:1-37:16, 2017.
Arnold Filtser, Omrit Filtser, and Matthew J Katz. Approximate nearest neighbor for curves - simple, efficient, and deterministic. arXiv preprint, 2019. URL: http://arxiv.org/abs/1902.07562.
http://arxiv.org/abs/1902.07562
Kristen Grauman and Trevor Darrell. Fast contour matching using approximate earth mover’s distance. In CVPR, pages I-220. IEEE, 2004.
Joachim Gudmundsson and Rasmus Pagh. Range-efficient consistent sampling and locality-sensitive hashing for polygons. In ISAAC, 2017.
Sariel Har-Peled, Piotr Indyk, and Rajeev Motwani. Approximate nearest neighbor: Towards removing the curse of dimensionality. Theory of computing, 8(1):321-350, 2012.
Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. In STOC, pages 604-613. ACM, 1998.
Yehezkel Lamdan and Haim J Wolfson. Geometric hashing: A general and efficient model-based recognition scheme. In ICCV, page 238–249, 1988.
Lambert Schomaker, Edward de Leau, and Louis Vuurpijl. Using pen-based outlines for object-based annotation and image-based queries. In AVI, pages 585-592. Springer, 1999.
Shinji Umeyama. Parameterized point pattern matching and its application to recognition of object families. TPAMI, 15(2):136-144, 1993.
Remco C Veltkamp and Michiel Hagedoorn. State of the art in shape matching. In Principles of visual information retrieval, pages 87-119. Springer, 2001.
Charles T Zahn and Ralph Z Roskies. Fourier descriptors for plane closed curves. TOC, 100(3):269-281, 1972.
Haim Kaplan and Jay Tenenbaum
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode