Polytopes, Lattices, and Spherical Codes for the Nearest Neighbor Problem

Author Thijs Laarhoven



File

LIPIcs.ICALP.2020.76.pdf
  • Filesize: 0.85 MB
  • 14 pages

Author Details

Thijs Laarhoven
  • Eindhoven University of Technology, The Netherlands

Cite As

Thijs Laarhoven. Polytopes, Lattices, and Spherical Codes for the Nearest Neighbor Problem. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 76:1-76:14, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.ICALP.2020.76

Abstract

We study locality-sensitive hash methods for the nearest neighbor problem for the angular distance, focusing on the approach of first projecting down onto a random low-dimensional subspace, and then partitioning the projected vectors according to the Voronoi cells induced by a well-chosen spherical code. This approach generalizes and interpolates between the fast but asymptotically suboptimal hyperplane hashing of Charikar [STOC 2002], and the asymptotically optimal but in practice often slower hash families of e.g. Andoni-Indyk [FOCS 2006], Andoni-Indyk-Nguyen-Razenshteyn [SODA 2014] and Andoni-Indyk-Laarhoven-Razenshteyn-Schmidt [NIPS 2015]. We set up a framework for analyzing the performance of any spherical code in this context, and we provide results for various codes appearing in the literature, such as those related to regular polytopes and root lattices. Similar to hyperplane hashing, and unlike e.g. cross-polytope hashing, our analysis of collision probabilities and query exponents is exact and does not hide any order terms which vanish only for large d, thus facilitating easier parameter selection in practical applications. For the two-dimensional case, we analytically derive closed-form expressions for arbitrary spherical codes, and we show that the equilateral triangle is optimal, achieving a better performance than the two-dimensional analogues of hyperplane and cross-polytope hashing. In three and four dimensions, we numerically find that the tetrahedron and 5-cell (the 3-simplex and 4-simplex) and the 16-cell (the 4-orthoplex) achieve the best query exponents, while in five or more dimensions orthoplices appear to outperform regular simplices, as well as the root lattice families A_k and D_k, in terms of minimizing the query exponent.
We provide lower bounds based on spherical caps, and we predict that in higher dimensions larger spherical codes exist which outperform orthoplices in terms of the query exponent. We also argue why using the D_k root lattices will likely lead to better results in practice than using cross-polytopes, due to a better trade-off between the asymptotic query exponent and the concrete costs of hashing.
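
The two-step scheme described in the abstract (project to a low dimension k, then assign the projected vector to the Voronoi cell of its nearest codeword) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names are hypothetical, the projection is assumed to be a Gaussian random matrix, the collision probability is estimated by Monte Carlo sampling rather than by the exact analysis the paper provides, and the query exponent is taken to be the usual LSH quantity rho = log(1/p1) / log(1/p2).

```python
import numpy as np

def orthoplex_code(k):
    """Vertices of the k-orthoplex (cross-polytope): the 2k vectors +/- e_i."""
    return np.vstack([np.eye(k), -np.eye(k)])

def simplex_code(k):
    """k+1 unit vectors in R^k forming a regular simplex (pairwise inner product -1/k)."""
    # Center the standard basis of R^(k+1); the rows then span a k-dim hyperplane.
    centered = np.eye(k + 1) - 1.0 / (k + 1)
    # Express the rows in an orthonormal basis of that hyperplane.
    _, _, vt = np.linalg.svd(centered)
    coords = centered @ vt[:k].T
    return coords / np.linalg.norm(coords, axis=1, keepdims=True)

def make_hash(d, code, rng):
    """Sample one hash function R^d -> {0, ..., len(code) - 1}.

    Assumption: the random projection is a k x d Gaussian matrix.
    """
    k = code.shape[1]
    A = rng.standard_normal((k, d))

    def h(x):
        # For unit-norm codewords, the nearest codeword (Voronoi cell)
        # is the one with the largest inner product.
        return int(np.argmax(code @ (A @ x)))

    return h

def collision_probability(code, theta, d=16, trials=20000, seed=0):
    """Monte Carlo estimate of Pr[h(x) = h(y)] for unit vectors at angle theta."""
    rng = np.random.default_rng(seed)
    k = code.shape[1]
    x = np.zeros(d)
    x[0] = 1.0
    y = np.zeros(d)
    y[0], y[1] = np.cos(theta), np.sin(theta)
    # One independent projection matrix per trial, evaluated in a batch.
    A = rng.standard_normal((trials, k, d))
    hx = np.argmax((A @ x) @ code.T, axis=1)
    hy = np.argmax((A @ y) @ code.T, axis=1)
    return float(np.mean(hx == hy))

def query_exponent(code, theta_near, theta_far, **kwargs):
    """Estimate rho = log(1/p1) / log(1/p2) from sampled collision probabilities."""
    p1 = collision_probability(code, theta_near, **kwargs)
    p2 = collision_probability(code, theta_far, **kwargs)
    return float(np.log(1.0 / p1) / np.log(1.0 / p2))
```

With `simplex_code(1)` the code consists of the two points +1 and -1, so the sketch reduces to hyperplane hashing, whose collision probability is known to be 1 - theta/pi. Calling `query_exponent(simplex_code(2), np.pi / 3, np.pi / 2)` and `query_exponent(orthoplex_code(2), np.pi / 3, np.pi / 2)` gives sampled estimates for the triangle and the two-dimensional cross-polytope; the estimates are noisy, so they should not be read as a substitute for the closed-form expressions derived in the paper.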

Subject Classification

ACM Subject Classification
  • Theory of computation → Nearest neighbor algorithms
  • Theory of computation → Random projections and metric embeddings
  • Theory of computation → Computational geometry
Keywords
  • (approximate) nearest neighbor problem
  • spherical codes
  • polytopes
  • lattices
  • locality-sensitive hashing (LSH)

References

  1. Dimitris Achlioptas. Database-friendly random projections. In PODS, pages 274-281, 2001. URL: https://doi.org/10.1145/375551.375608.
  2. Martin R. Albrecht, Léo Ducas, Gottfried Herold, Elena Kirshanova, Eamonn Postlethwaite, and Marc Stevens. The general sieve kernel and new records in lattice reduction. In EUROCRYPT, pages 717-746, 2019.
  3. Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In FOCS, pages 459-468, 2006. URL: https://doi.org/10.1109/FOCS.2006.49.
  4. Alexandr Andoni, Piotr Indyk, Thijs Laarhoven, Ilya Razenshteyn, and Ludwig Schmidt. Practical and optimal LSH for angular distance. In NIPS, pages 1225-1233, 2015. URL: https://papers.nips.cc/paper/5893-practical-and-optimal-lsh-for-angular-distance.
  5. Alexandr Andoni, Piotr Indyk, Huy Lê Nguyên, and Ilya Razenshteyn. Beyond locality-sensitive hashing. In SODA, pages 1018-1028, 2014. URL: https://doi.org/10.1137/1.9781611973402.76.
  6. Alexandr Andoni, Thijs Laarhoven, Ilya Razenshteyn, and Erik Waingarten. Optimal hashing-based time-space trade-offs for approximate near neighbors. In SODA, pages 47-66, 2017. URL: https://doi.org/10.1137/1.9781611974782.4.
  7. Alexandr Andoni and Ilya Razenshteyn. Optimal data-dependent hashing for approximate near neighbors. In STOC, pages 793-801, 2015. URL: https://doi.org/10.1145/2746539.2746553.
  8. Sunil Arya, David M. Mount, Nathan S. Netanyahu, Ruth Silverman, and Angela Y. Wu. An optimal algorithm for approximate nearest neighbor searching in fixed dimensions. In SODA, pages 573-582, 1994. URL: http://dl.acm.org/citation.cfm?id=314464.314652.
  9. Martin Aumueller, Erik Bernhardsson, and Alexander Faithfull. ANN benchmarks - available online at http://sss.projects.itu.dk/ann-benchmarks/, 2017. URL: http://sss.projects.itu.dk/ann-benchmarks/.
  10. Albert Baernstein and B.A. Taylor. Spherical rearrangements, subharmonic functions, and *-functions in n-space. Duke Math. J., 43(2):245-268, June 1976. URL: https://doi.org/10.1215/S0012-7094-76-04322-2.
  11. Anja Becker, Léo Ducas, Nicolas Gama, and Thijs Laarhoven. New directions in nearest neighbor searching with applications to lattice sieving. In SODA, pages 10-24, 2016. URL: https://doi.org/10.1137/1.9781611974331.ch2.
  12. Anja Becker and Thijs Laarhoven. Efficient (ideal) lattice sieving using cross-polytope LSH. In AFRICACRYPT, pages 3-23, 2016. URL: https://doi.org/10.1007/978-3-319-31517-1_1.
  13. Erik Bernhardsson. ANN benchmarks - available online at https://github.com/erikbern/ann-benchmarks, 2016. URL: https://github.com/erikbern/ann-benchmarks.
  14. Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer-Verlag, 2006. URL: https://www.springer.com/us/book/9780387310732.
  15. Karthekeyan Chandrasekaran, Daniel Dadush, Venkata Gandikota, and Elena Grigorescu. Lattice-based locality sensitive hashing is optimal. In ITCS, 2018.
  16. Moses S. Charikar. Similarity estimation techniques from rounding algorithms. In STOC, pages 380-388, 2002. URL: https://doi.org/10.1145/509907.509965.
  17. Tobias Christiani. A framework for similarity search with space-time tradeoffs using locality-sensitive filtering. In SODA, pages 31-46, 2017. URL: https://doi.org/10.1137/1.9781611974782.3.
  18. Persi Diaconis and David Freedman. A dozen de Finetti-style results in search of a theory. Annales de l'IHP Probabilités et statistiques, 23(2):397-423, 1987.
  19. Moshe Dubiner. Bucketing coding and information theory for the statistical high-dimensional nearest-neighbor problem. IEEE Transactions on Information Theory, 56(8):4166-4179, August 2010. URL: https://doi.org/10.1109/TIT.2010.2050814.
  20. Richard O. Duda, Peter E. Hart, and David G. Stork. Pattern Classification (2nd Edition). Wiley, 2000. URL: https://dl.acm.org/citation.cfm?id=954544.
  21. Alan Genz, Frank Bretz, Tetsuhisa Miwa, Xuefei Mi, Friedrich Leisch, Fabian Scheipl, and Torsten Hothorn. mvtnorm: Multivariate normal and t distributions, 2019. R package version 1.0-10. URL: https://CRAN.R-project.org/package=mvtnorm.
  22. Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In STOC, pages 604-613, 1998. URL: https://doi.org/10.1145/276698.276876.
  23. Tiefeng Jiang. How many entries of a typical orthogonal matrix can be approximated by independent normals? The Annals of Probability, 34(4):1497-1529, 2006. URL: https://doi.org/10.1214/009117906000000205.
  24. Christopher Kennedy and Rachel Ward. Fast cross-polytope locality-sensitive hashing. In ITCS, pages 1-16, 2017. URL: https://doi.org/10.4230/LIPIcs.ITCS.2017.53.
  25. Thijs Laarhoven. Sieving for shortest vectors in lattices using angular locality-sensitive hashing. In CRYPTO, pages 3-22, 2015. URL: https://doi.org/10.1007/978-3-662-47989-6_1.
  26. Thijs Laarhoven. Tradeoffs for nearest neighbors on the sphere. arXiv:1511.07527 [cs.DS], pages 1-16, 2015. URL: http://arxiv.org/abs/1511.07527.
  27. Thijs Laarhoven. Hypercube LSH for approximate near neighbors. In MFCS, pages 1-20, 2017. URL: https://doi.org/10.4230/LIPIcs.MFCS.2017.7.
  28. Artur Mariano, Thijs Laarhoven, and Christian Bischof. Parallel (probable) lock-free HashSieve: a practical sieving algorithm for the SVP. In ICPP, pages 590-599, 2015. URL: https://doi.org/10.1109/ICPP.2015.68.
  29. Artur Mariano, Thijs Laarhoven, and Christian Bischof. A parallel variant of LDSieve for the SVP on lattices. In PDP, pages 23-30, 2017. URL: https://doi.org/10.1109/PDP.2017.60.
  30. Alexander May and Ilya Ozerov. On computing nearest neighbors with applications to decoding of binary linear codes. In EUROCRYPT, pages 203-228, 2015. URL: https://doi.org/10.1007/978-3-662-46800-5_9.
  31. Ilya Razenshteyn and Ludwig Schmidt. FALCONN - available online at https://falconn-lib.org/, 2016. URL: https://falconn-lib.org/.
  32. Gregory Shakhnarovich, Trevor Darrell, and Piotr Indyk. Nearest-Neighbor Methods in Learning and Vision: Theory and Practice. MIT Press, 2005. URL: http://ttic.uchicago.edu/~gregory/annbook/book.html.
  33. William F. Sheppard. On the application of the theory of error to cases of normal distribution and normal correlation. Philosophical Transactions of the Royal Society A, 192:101-167, 1899.
  34. Neil J.A. Sloane. Spherical codes: nice arrangements of points on a sphere in various dimensions. URL: http://neilsloane.com/packings/.
  35. Kengo Terasawa and Yuzuru Tanaka. Spherical LSH for approximate nearest neighbor search on unit hypersphere. In WADS, pages 27-38, 2007. URL: https://doi.org/10.1007/978-3-540-73951-7_4.
  36. Kengo Terasawa and Yuzuru Tanaka. Approximate nearest neighbor search for a dataset of normalized vectors. In IEICE Transactions on Information and Systems, volume 92(9), pages 1609-1619, 2009. URL: http://search.ieice.org/bin/summary.php?id=e92-d_9_1609.
  37. Serge Vladut. Lattices with exponentially large kissing numbers. Moscow J. Comb. Number Th., 8:163-177, 2019.