Lattice-based Locality Sensitive Hashing is Optimal

Authors Karthekeyan Chandrasekaran, Daniel Dadush, Venkata Gandikota, Elena Grigorescu



PDF
Thumbnail PDF

File

LIPIcs.ITCS.2018.42.pdf
  • Filesize: 0.54 MB
  • 18 pages

Document Identifiers

Author Details

Karthekeyan Chandrasekaran
Daniel Dadush
Venkata Gandikota
Elena Grigorescu

Cite As Get BibTex

Karthekeyan Chandrasekaran, Daniel Dadush, Venkata Gandikota, and Elena Grigorescu. Lattice-based Locality Sensitive Hashing is Optimal. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 94, pp. 42:1-42:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018) https://doi.org/10.4230/LIPIcs.ITCS.2018.42

Abstract

Locality sensitive hashing (LSH) was introduced by Indyk and Motwani (STOC'98) to give the first sublinear time algorithm for the c-approximate nearest neighbor (ANN) problem using only polynomial space. At a high level, an LSH family hashes "nearby" points to the same bucket and "far away" points to different buckets. The quality of measure of an LSH family is its LSH exponent, which helps determine both query time and space usage.
  
In a seminal work, Andoni and Indyk (FOCS '06) constructed an LSH family based on random ball partitionings of space that achieves an LSH exponent of 1/c^2 for the l_2 norm, which was later shown to be optimal by Motwani, Naor and Panigrahy (SIDMA '07) and O'Donnell, Wu and Zhou (TOCT '14). Although optimal in the LSH exponent, the ball partitioning approach is computationally expensive. So, in the same work, Andoni and Indyk proposed a simpler and more practical hashing scheme based on Euclidean lattices and provided computational results using the 24-dimensional Leech lattice. However, no theoretical analysis of the scheme was given, thus leaving open the question of finding the exponent of lattice based LSH.

In this work, we resolve this question by showing the existence of lattices achieving the optimal LSH exponent of 1/c^2 using techniques from the geometry of numbers. At a more conceptual level, our results show that optimal LSH space partitions can have periodic structure. Understanding the extent to which additional structure can be imposed on these partitions, e.g. to yield low space and query complexity, remains an important open problem.

Subject Classification

Keywords
  • Locality Sensitive Hashing
  • Approximate Nearest Neighbor Search
  • Random Lattices

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Divesh Aggarwal, Daniel Dadush, and Noah Stephens-Davidowitz. Solving the closest vector problem in 2ⁿ time - the discrete Gaussian strikes again! In FOCS, pages 563-582, 2015. Google Scholar
  2. Miklós Ajtai. Random lattices and a conjectured 0-1 law about their polynomial time computable properties. In FOCS, pages 733-742. IEEE, 2002. Google Scholar
  3. Ofer Amrani and Yair Beery. Efficient bounded-distance decoding of the hexacode and associated decoders for the leech lattice and the golay code. IEEE Transactions on Communications, 44(5):534-537, 1996. Google Scholar
  4. Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In FOCS, pages 459-468. IEEE, 2006. Google Scholar
  5. Alexandr Andoni, Piotr Indyk, Thijs Laarhoven, Ilya Razenshteyn, and Ludwig Schmidt. Practical and optimal lsh for angular distance. In Advances in Neural Information Processing Systems, pages 1225-1233, 2015. Google Scholar
  6. Alexandr Andoni, Piotr Indyk, Huy L Nguyên, and Ilya Razenshteyn. Beyond locality-sensitive hashing. In SODA, pages 1018-1028. SIAM, 2014. Google Scholar
  7. Alexandr Andoni, Thijs Laarhoven, Ilya Razenshteyn, and Erik Waingarten. Optimal hashing-based time-space trade-offs for approximate near neighbors. In SODA, 2017. Google Scholar
  8. Alexandr Andoni and Ilya Razenshteyn. Optimal data-dependent hashing for approximate near neighbors. In STOC, 2015. Google Scholar
  9. Alexandr Andoni and Ilya Razenshteyn. Tight lower bounds for data-dependent locality-sensitive hashing. arXiv preprint arXiv:1507.04299, 2015. Google Scholar
  10. Anja Becker, Léo Ducas, Nicolas Gama, and Thijs Laarhoven. New directions in nearest neighbor searching with applications to lattice sieving. In SODA, pages 10-24, Philadelphia, PA, USA, 2016. Society for Industrial and Applied Mathematics. URL: http://dl.acm.org/citation.cfm?id=2884435.2884437.
  11. L. A. Carraher, P. A. Wilsey, and F. S. Annexstein. A gpgpu algorithm for c-approximate r-nearest neighbor search in high dimensions. In 2013 IEEE International Symposium on Parallel Distributed Processing, pages 2079-2088, May 2013. Google Scholar
  12. Tobias Christiani. A framework for similarity search with space-time tradeoffs using locality-sensitive filtering. In SODA, pages 31-46, Philadelphia, PA, USA, 2017. Society for Industrial and Applied Mathematics. URL: http://dl.acm.org/citation.cfm?id=3039686.3039689.
  13. Daniel Dadush and Nicolas Bonifas. Short paths on the Voronoi graph and closest vector problem with preprocessing. In SODA, pages 295-314, 2015. Google Scholar
  14. Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the Twentieth Annual Symposium on Computational Geometry (SOCG), pages 253-262. ACM, 2004. Google Scholar
  15. Léo Ducas and Wessel P. J. van Woerden. The closest vector problem in tensored root lattices of type a and in their duals. Designs, Codes and Cryptography, 2017. Google Scholar
  16. Uri Erez, Simon Litsyn, and Ram Zamir. Lattices which are good for (almost) everything. IEEE Transactions on Information Theory, 51(10):3401-3416, 2005. Google Scholar
  17. Kave Eshghi and Shyamsundar Rajaram. Locality sensitive hash functions based on concomitant rank order statistics. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 221-229. ACM, 2008. Google Scholar
  18. Aristides Gionis, Piotr Indyk, and Rajeev Motwani. Similarity search in high dimensions via hashing. In VLDB, pages 518-529, 1999. Google Scholar
  19. Sariel Har-Peled, Piotr Indyk, and Rajeev Motwani. Approximate nearest neighbor: Towards removing the curse of dimensionality. Theory of computing, 8(1):321-350, 2012. Google Scholar
  20. Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. In STOC, pages 604-613. ACM, 1998. Google Scholar
  21. Hervé Jégou, Laurent Amsaleg, Cordelia Schmid, and Patrick Gros. Query adaptative locality sensitive hashing. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2008., pages 825-828. IEEE, 2008. Google Scholar
  22. Ravi Kannan. Minkowski’s convex body theorem and integer programming. Mathematics of operations research, 12(3):415-440, 1987. Google Scholar
  23. Thijs Laarhoven. Sieving for shortest vectors in lattices using angular locality-sensitive hashing. In Rosario Gennaro and Matthew Robshaw, editors, Advances in Cryptology - CRYPTO, pages 3-22, Berlin, Heidelberg, 2015. Springer Berlin Heidelberg. Google Scholar
  24. R. McKilliam, A. Grant, and I. Clarkson. Finding a closest point in a lattice of voronoi’s first kind. SIAM J. on Discret. Math., 28(3):1405-1422, 2014. Google Scholar
  25. Rajeev Motwani, Assaf Naor, and Rina Panigrahy. Lower bounds on locality sensitive hashing. SIAM Journal on Discrete Mathematics (SIDMA), 21(4):930-935, 2007. Google Scholar
  26. Ryan ODonnell, Yi Wu, and Yuan Zhou. Optimal lower bounds for locality-sensitive hashing (except when q is tiny). ACM Transactions on Computation Theory (TOCT), 6(1):5, 2014. Preliminary version in ICS 2011. Google Scholar
  27. CA Rogers. Lattice coverings of space: the minkowski-hlawka theorem. Proceedings of the London Mathematical Society, 3(3):447-465, 1958. Google Scholar
  28. Wolfgang M Schmidt. The measure of the set of admissible lattices. Proceedings of the American Mathematical Society, 9(3):390-403, 1958. Google Scholar
  29. Gregory Shakhnarovich, Trevor Darrell, and Piotr Indyk. Nearest-Neighbor Methods in Learning and Vision: Theory and Practice (Neural Information Processing). The MIT press, 2006. Google Scholar
  30. Carl Ludwig Siegel. A mean value theorem in geometry of numbers. Annals of Mathematics, pages 340-347, 1945. Google Scholar
  31. Kengo Terasawa and Yuzuru Tanaka. Spherical lsh for approximate nearest neighbor search on unit hypersphere. Algorithms and Data Structures, pages 27-38, 2007. Google Scholar
  32. G. Valiant. Finding correlations in subquadratic time, with applications to learning parities and juntas. In FOCS, 2012. Google Scholar
  33. Frank Vallentin. Sphere coverings, lattices, and tilings (in low dimensions). PhD thesis, Technical University of Munich, 2003. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail