Fast Cross-Polytope Locality-Sensitive Hashing

Authors Christopher Kennedy, Rachel Ward



PDF
Thumbnail PDF

File

LIPIcs.ITCS.2017.53.pdf
  • Filesize: 0.6 MB
  • 16 pages

Document Identifiers

Author Details

Christopher Kennedy
Rachel Ward

Cite As Get BibTex

Christopher Kennedy and Rachel Ward. Fast Cross-Polytope Locality-Sensitive Hashing. In 8th Innovations in Theoretical Computer Science Conference (ITCS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 67, pp. 53:1-53:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017) https://doi.org/10.4230/LIPIcs.ITCS.2017.53

Abstract

We provide a variant of cross-polytope locality sensitive hashing with respect to angular distance which is provably optimal in asymptotic sensitivity and enjoys \mathcal{O}(d \ln d ) hash computation time.  Building on a recent result in (Andoni, Indyk, Laarhoven, Razenshteyn '15), we show that optimal asymptotic sensitivity for cross-polytope LSH is retained even when the dense Gaussian matrix is replaced by a fast Johnson-Lindenstrauss transform followed by discrete pseudo-rotation, reducing the hash computation time from \mathcal{O}(d^2) to \mathcal{O}(d \ln d ).  Moreover, our scheme achieves the optimal rate of convergence for sensitivity. By incorporating a low-randomness Johnson-Lindenstrauss transform, our scheme can be modified to require only \mathcal{O}(\ln^9(d)) random bits.

Subject Classification

Keywords
  • Locality-sensitive hashing
  • Dimension reduction
  • Johnson-Lindenstrauss lemma

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Nir Ailon and Bernard Chazelle. The fast Johnson-Lindenstrauss transform and approximate nearest neighbors. SIAM J. Comput., 39(1):302-322, May 2009. URL: http://dx.doi.org/10.1137/060673096.
  2. Noga Alon. Problems and results in extremal combinatorics. Discrete Mathematics, 273:31-53, 2003. Google Scholar
  3. Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science, FOCS '06, pages 459-468, Washington, DC, USA, 2006. IEEE Computer Society. URL: http://dx.doi.org/10.1109/FOCS.2006.49.
  4. Alexandr Andoni, Piotr Indyk, Thijs Laarhoven, Ilya Razenshteyn, and Ludwig Schmidt. Practical and optimal lsh for angular distance. In Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS'15, pages 1225-1233, Cambridge, MA, USA, 2015. MIT Press. URL: http://dl.acm.org/citation.cfm?id=2969239.2969376.
  5. Alexandr Andoni, Piotr Indyk, Huy L Nguyen, and Ilya Razenshteyn. Beyond Locality-Sensitive Hashing. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1018-1028. SIAM, 2014. Google Scholar
  6. Alexandr Andoni and Ilya Razenshteyn. Optimal data-dependent hashing for approximate near neighbors. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC '15, pages 793-801, New York, NY, USA, 2015. ACM. URL: http://dx.doi.org/10.1145/2746539.2746553.
  7. Alexandr Andoni and Ilya Razenshteyn. Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing. ArXiv e-prints, July 2015. URL: http://arxiv.org/abs/1507.04299.
  8. Anja Becker and Thijs Laarhoven. Efficient (ideal) lattice sieving using cross-polytope lsh. Cryptology ePrint Archive, Report 2015/823, 2015. URL: http://eprint.iacr.org/.
  9. Anirban Dasgupta, Ravi Kumar, and Tamás Sarlós. Fast locality-sensitive hashing. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1073-1081. ACM, 2011. Google Scholar
  10. Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab Mirrokni. Locality sensitive hashing scheme based on p-stable distributions. In Proceedings of the tmentieth annual Symposium on Computational Geometry. New York, pages 253-262, 2004. Google Scholar
  11. Ishay Haviv and Oded Regev. The restricted isometry property of subsampled fourier matrices. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '16, pages 288-297, Philadelphia, PA, USA, 2016. Society for Industrial and Applied Mathematics. URL: http://dl.acm.org/citation.cfm?id=2884435.2884457.
  12. Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC '98, pages 604-613, New York, NY, USA, 1998. ACM. URL: http://dx.doi.org/10.1145/276698.276876.
  13. William B Johnson and Joram Lindenstrauss. Extensions of lipschitz mappings into a hilbert space. Contemporary mathematics, 26:189-206, 1984. Google Scholar
  14. Daniel M. Kane and Jelani Nelson. Sparser johnson-lindenstrauss transforms. J. ACM, 61(1):4:1-4:23, January 2014. URL: http://dx.doi.org/10.1145/2559902.
  15. Felix Krahmer and Rachel Ward. New and improved Johnson-Lindenstrauss embeddings via the restricted isometry property. SIAM Journal on Mathematical Analysis, 43(3):1269-1281, 2011. Google Scholar
  16. Kasper Green Larsen and Jelani Nelson. The johnson-lindenstrauss lemma is optimal for linear dimensionality reduction. arXiv preprint arXiv:1411.2404, 2014. Google Scholar
  17. Yue Lin, Rong Jin, Deng Cai, Shuicheng Yan, and Xuelong Li. Compressed hashing. In Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, CVPR '13, pages 446-451, Washington, DC, USA, 2013. IEEE Computer Society. URL: http://dx.doi.org/10.1109/CVPR.2013.64.
  18. Ting Liu, Andrew W. Moore, Ke Yang, and Alexander G. Gray. An investigation of practical approximate nearest neighbor algorithms. In L. K. Saul, Y. Weiss, and L. Bottou, editors, Advances in Neural Information Processing Systems 17, pages 825-832. MIT Press, 2005. URL: http://papers.nips.cc/paper/2666-an-investigation-of-practical-approximate-nearest-neighbor-algorithms.pdf.
  19. Rajeev Motwani, Assaf Naor, and Rina Panigrahi. Lower bounds on Locality Sensitive Hashing. In Proceedings of the Twenty-second Annual Symposium on Computational Geometry, SCG '06, pages 154-157, New York, NY, USA, 2006. ACM. URL: http://dx.doi.org/10.1145/1137856.1137881.
  20. Ryan O'Donnell, Yi Wu, and Yuan Zhou. Optimal lower bounds for Locality-Sensitive Hashing (except when Q is tiny). ACM Trans. Comput. Theory, 6(1):5:1-5:13, March 2014. URL: http://dx.doi.org/10.1145/2578221.
  21. Andrei Osipov. A Randomized Approximate Nearest Neighbors Algorithm. PhD thesis, Yale University, New Haven, CT, USA, 2011. AAI3467911. Google Scholar
  22. Kengo Terasawa and Yuzuru Tanaka. Spherical LSH for approximate nearest neighbor search on unit hypersphere. In Algorithms and Data Structures, pages 27-38. Springer, 2007. Google Scholar
  23. Roger Weber, Hans-Jörg Schek, and Stephen Blott. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In Proceedings of the 24rd International Conference on Very Large Data Bases, VLDB '98, pages 194-205, San Francisco, CA, USA, 1998. Morgan Kaufmann Publishers Inc. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail