Lattice-based Locality Sensitive Hashing is Optimal
Locality sensitive hashing (LSH) was introduced by Indyk and Motwani (STOC'98) to give the first sublinear time algorithm for the c-approximate nearest neighbor (ANN) problem using only polynomial space. At a high level, an LSH family hashes "nearby" points to the same bucket and "far away" points to different buckets. The quality of measure of an LSH family is its LSH exponent, which helps determine both query time and space usage.
In a seminal work, Andoni and Indyk (FOCS '06) constructed an LSH family based on random ball partitionings of space that achieves an LSH exponent of 1/c^2 for the l_2 norm, which was later shown to be optimal by Motwani, Naor and Panigrahy (SIDMA '07) and O'Donnell, Wu and Zhou (TOCT '14). Although optimal in the LSH exponent, the ball partitioning approach is computationally expensive. So, in the same work, Andoni and Indyk proposed a simpler and more practical hashing scheme based on Euclidean lattices and provided computational results using the 24-dimensional Leech lattice. However, no theoretical analysis of the scheme was given, thus leaving open the question of finding the exponent of lattice based LSH.
In this work, we resolve this question by showing the existence of lattices achieving the optimal LSH exponent of 1/c^2 using techniques from the geometry of numbers. At a more conceptual level, our results show that optimal LSH space partitions can have periodic structure. Understanding the extent to which additional structure can be imposed on these partitions, e.g. to yield low space and query complexity, remains an important open problem.
Locality Sensitive Hashing
Approximate Nearest Neighbor Search
Random Lattices
42:1-42:18
Regular Paper
Karthekeyan
Chandrasekaran
Karthekeyan Chandrasekaran
Daniel
Dadush
Daniel Dadush
Venkata
Gandikota
Venkata Gandikota
Elena
Grigorescu
Elena Grigorescu
10.4230/LIPIcs.ITCS.2018.42
Divesh Aggarwal, Daniel Dadush, and Noah Stephens-Davidowitz. Solving the closest vector problem in 2ⁿ time - the discrete Gaussian strikes again! In FOCS, pages 563-582, 2015.
Miklós Ajtai. Random lattices and a conjectured 0-1 law about their polynomial time computable properties. In FOCS, pages 733-742. IEEE, 2002.
Ofer Amrani and Yair Beery. Efficient bounded-distance decoding of the hexacode and associated decoders for the leech lattice and the golay code. IEEE Transactions on Communications, 44(5):534-537, 1996.
Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In FOCS, pages 459-468. IEEE, 2006.
Alexandr Andoni, Piotr Indyk, Thijs Laarhoven, Ilya Razenshteyn, and Ludwig Schmidt. Practical and optimal lsh for angular distance. In Advances in Neural Information Processing Systems, pages 1225-1233, 2015.
Alexandr Andoni, Piotr Indyk, Huy L Nguyên, and Ilya Razenshteyn. Beyond locality-sensitive hashing. In SODA, pages 1018-1028. SIAM, 2014.
Alexandr Andoni, Thijs Laarhoven, Ilya Razenshteyn, and Erik Waingarten. Optimal hashing-based time-space trade-offs for approximate near neighbors. In SODA, 2017.
Alexandr Andoni and Ilya Razenshteyn. Optimal data-dependent hashing for approximate near neighbors. In STOC, 2015.
Alexandr Andoni and Ilya Razenshteyn. Tight lower bounds for data-dependent locality-sensitive hashing. arXiv preprint arXiv:1507.04299, 2015.
Anja Becker, Léo Ducas, Nicolas Gama, and Thijs Laarhoven. New directions in nearest neighbor searching with applications to lattice sieving. In SODA, pages 10-24, Philadelphia, PA, USA, 2016. Society for Industrial and Applied Mathematics. URL: http://dl.acm.org/citation.cfm?id=2884435.2884437.
http://dl.acm.org/citation.cfm?id=2884435.2884437
L. A. Carraher, P. A. Wilsey, and F. S. Annexstein. A gpgpu algorithm for c-approximate r-nearest neighbor search in high dimensions. In 2013 IEEE International Symposium on Parallel Distributed Processing, pages 2079-2088, May 2013.
Tobias Christiani. A framework for similarity search with space-time tradeoffs using locality-sensitive filtering. In SODA, pages 31-46, Philadelphia, PA, USA, 2017. Society for Industrial and Applied Mathematics. URL: http://dl.acm.org/citation.cfm?id=3039686.3039689.
http://dl.acm.org/citation.cfm?id=3039686.3039689
Daniel Dadush and Nicolas Bonifas. Short paths on the Voronoi graph and closest vector problem with preprocessing. In SODA, pages 295-314, 2015.
Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the Twentieth Annual Symposium on Computational Geometry (SOCG), pages 253-262. ACM, 2004.
Léo Ducas and Wessel P. J. van Woerden. The closest vector problem in tensored root lattices of type a and in their duals. Designs, Codes and Cryptography, 2017.
Uri Erez, Simon Litsyn, and Ram Zamir. Lattices which are good for (almost) everything. IEEE Transactions on Information Theory, 51(10):3401-3416, 2005.
Kave Eshghi and Shyamsundar Rajaram. Locality sensitive hash functions based on concomitant rank order statistics. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 221-229. ACM, 2008.
Aristides Gionis, Piotr Indyk, and Rajeev Motwani. Similarity search in high dimensions via hashing. In VLDB, pages 518-529, 1999.
Sariel Har-Peled, Piotr Indyk, and Rajeev Motwani. Approximate nearest neighbor: Towards removing the curse of dimensionality. Theory of computing, 8(1):321-350, 2012.
Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. In STOC, pages 604-613. ACM, 1998.
Hervé Jégou, Laurent Amsaleg, Cordelia Schmid, and Patrick Gros. Query adaptative locality sensitive hashing. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2008., pages 825-828. IEEE, 2008.
Ravi Kannan. Minkowski’s convex body theorem and integer programming. Mathematics of operations research, 12(3):415-440, 1987.
Thijs Laarhoven. Sieving for shortest vectors in lattices using angular locality-sensitive hashing. In Rosario Gennaro and Matthew Robshaw, editors, Advances in Cryptology - CRYPTO, pages 3-22, Berlin, Heidelberg, 2015. Springer Berlin Heidelberg.
R. McKilliam, A. Grant, and I. Clarkson. Finding a closest point in a lattice of voronoi’s first kind. SIAM J. on Discret. Math., 28(3):1405-1422, 2014.
Rajeev Motwani, Assaf Naor, and Rina Panigrahy. Lower bounds on locality sensitive hashing. SIAM Journal on Discrete Mathematics (SIDMA), 21(4):930-935, 2007.
Ryan ODonnell, Yi Wu, and Yuan Zhou. Optimal lower bounds for locality-sensitive hashing (except when q is tiny). ACM Transactions on Computation Theory (TOCT), 6(1):5, 2014. Preliminary version in ICS 2011.
CA Rogers. Lattice coverings of space: the minkowski-hlawka theorem. Proceedings of the London Mathematical Society, 3(3):447-465, 1958.
Wolfgang M Schmidt. The measure of the set of admissible lattices. Proceedings of the American Mathematical Society, 9(3):390-403, 1958.
Gregory Shakhnarovich, Trevor Darrell, and Piotr Indyk. Nearest-Neighbor Methods in Learning and Vision: Theory and Practice (Neural Information Processing). The MIT press, 2006.
Carl Ludwig Siegel. A mean value theorem in geometry of numbers. Annals of Mathematics, pages 340-347, 1945.
Kengo Terasawa and Yuzuru Tanaka. Spherical lsh for approximate nearest neighbor search on unit hypersphere. Algorithms and Data Structures, pages 27-38, 2007.
G. Valiant. Finding correlations in subquadratic time, with applications to learning parities and juntas. In FOCS, 2012.
Frank Vallentin. Sphere coverings, lattices, and tilings (in low dimensions). PhD thesis, Technical University of Munich, 2003.
Creative Commons Attribution 3.0 Unported license
https://creativecommons.org/licenses/by/3.0/legalcode