Approximate Nearest Neighbors Search Without False Negatives For l_2 For c>sqrt{loglog{n}}

Authors Piotr Sankowski, Piotr Wygocki



PDF
Thumbnail PDF

File

LIPIcs.ISAAC.2017.63.pdf
  • Filesize: 0.56 MB
  • 12 pages

Document Identifiers

Author Details

Piotr Sankowski
Piotr Wygocki

Cite As Get BibTex

Piotr Sankowski and Piotr Wygocki. Approximate Nearest Neighbors Search Without False Negatives For l_2 For c>sqrt{loglog{n}}. In 28th International Symposium on Algorithms and Computation (ISAAC 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 92, pp. 63:1-63:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017) https://doi.org/10.4230/LIPIcs.ISAAC.2017.63

Abstract

In this paper, we report progress on answering the open problem presented by Pagh [11], who considered the near neighbor search without false negatives for the Hamming distance. We show new data structures for solving the c-approximate near neighbors problem without false negatives for Euclidean high dimensional space
\mathcal{R}^d. These data structures work for any c = \omega(\sqrt{\log{\log{n}}}), where n is the number of points in the input set, with poly-logarithmic query time and polynomial
pre-processing time. This improves over the known algorithms, which require c to be \Omega(\sqrt{d}). 

This improvement is obtained by applying a sequence of reductions, which are interesting on their own. First, we reduce the problem to d instances of dimension logarithmic in n. Next, these instances are reduced to a number of c-approximate near neighbor search without false negatives instances in \big(\Rspace^k\big)^L space equipped with metric m(x,y) = \max_{1 \le i \leL}(\dist{x_i - y_i}_2).

Subject Classification

Keywords
  • locality sensitive hashing
  • approximate near neighbor search
  • high- dimensional
  • similarity search

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Thomas Dybdahl Ahle. Optimal las vegas locality sensitive data structures. CoRR, abs/1704.02054, 2017. URL: http://arxiv.org/abs/1704.02054.
  2. Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun. ACM, 51(1):117-122, 2008. URL: http://dx.doi.org/10.1145/1327452.1327494.
  3. Alexandr Andoni and Ilya Razenshteyn. Optimal data-dependent hashing for approximate near neighbors. In Rocco A. Servedio and Ronitt Rubinfeld, editors, Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 793-801. ACM, 2015. URL: http://dx.doi.org/10.1145/2746539.2746553.
  4. Bernard Chazelle, Ding Liu, and Avner Magen. Approximate range searching in higher dimension. Computational Geometry, 39(1):24-29, 2008. URL: http://dx.doi.org/10.1016/j.comgeo.2007.05.008.
  5. Piotr Indyk. On approximate nearest neighbors in non-euclidean spaces. In 39th Annual Symposium on Foundations of Computer Science, FOCS'98, November 8-11, 1998, Palo Alto, California, USA, pages 148-155, 1998. URL: http://dx.doi.org/10.1109/SFCS.1998.743438.
  6. Piotr Indyk. Uncertainty principles, extractors, and explicit embeddings of l2 into l1. In Proceedings of the Thirty-ninth Annual ACM Symposium on Theory of Computing, STOC'07, pages 615-620, New York, NY, USA, 2007. ACM. URL: http://dx.doi.org/10.1145/1250790.1250881.
  7. Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC'98, pages 604-613, New York, NY, USA, 1998. ACM. URL: http://dx.doi.org/10.1145/276698.276876.
  8. William Johnson and Joram Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. In Conference in modern analysis and probability (New Haven, Conn., 1982), volume 26 of Contemporary Mathematics, pages 189-206. American Mathematical Society, 1984. Google Scholar
  9. Ryan O'Donnell, Yi Wu, and Yuan Zhou. Optimal lower bounds for locality-sensitive hashing (except when q is tiny). ACM Trans. Comput. Theory, 6(1):5:1-5:13, March 2014. URL: http://dx.doi.org/10.1145/2578221.
  10. Andrzej Pacuk, Piotr Sankowski, Karol Wegrzycki, and Piotr Wygocki. Locality-sensitive hashing without false negatives for l_p. In Computing and Combinatorics - 22nd International Conference, COCOON 2016, Ho Chi Minh City, Vietnam, August 2-4, 2016, Proceedings, pages 105-118, 2016. URL: http://dx.doi.org/10.1007/978-3-319-42634-1_9.
  11. Rasmus Pagh. Locality-sensitive hashing without false negatives. In Proceedings of the Twenty-seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA'16, pages 1-9, Philadelphia, PA, USA, 2016. Society for Industrial and Applied Mathematics. URL: http://dl.acm.org/citation.cfm?id=2884435.2884436.
  12. Ryan Williams. A new algorithm for optimal 2-constraint satisfaction and its implications. Theor. Comput. Sci., 348(2):357-365, December 2005. URL: http://dx.doi.org/10.1016/j.tcs.2005.09.023.
  13. Piotr Wygocki. On fast bounded locality sensitive hashing. ArXiv e-prints, 2017. URL: http://arxiv.org/abs/1704.05902.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail