Laarhoven, Thijs
Hypercube LSH for Approximate near Neighbors
Abstract
A celebrated technique for finding near neighbors for the angular distance involves using a set of random hyperplanes to partition the space into hash regions [Charikar, STOC 2002]. Experiments later showed that using a set of orthogonal hyperplanes, thereby partitioning the space into the Voronoi regions induced by a hypercube, leads to even better results [Terasawa and Tanaka, WADS 2007]. However, no theoretical explanation for this improvement was ever given, and it remained unclear how the resulting hypercube hash method scales in high dimensions.
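The difference between the two partitioning schemes can be sketched in a few lines. This is a minimal illustration only, not the paper's implementation: the helper names (`gaussian_vector`, `orthonormalize`, `sign_hash`), the dimension `d = 8`, and the use of Gram-Schmidt to obtain orthogonal hyperplanes are all assumptions made for this example.

```python
import math
import random

def gaussian_vector(d, rng):
    """Draw a d-dimensional vector with independent standard normal entries."""
    return [rng.gauss(0.0, 1.0) for _ in range(d)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(rows):
    """Gram-Schmidt: turn linearly independent rows into an orthonormal set."""
    basis = []
    for row in rows:
        w = list(row)
        for b in basis:
            proj = dot(w, b)
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

def sign_hash(normals, x):
    """Hash x to one bit per hyperplane: which side of the hyperplane x lies on."""
    return tuple(1 if dot(n, x) >= 0 else 0 for n in normals)

rng = random.Random(0)
d = 8  # hypothetical dimension, chosen only for illustration

# Hyperplane-style hashing: d independent random hyperplanes.
random_normals = [gaussian_vector(d, rng) for _ in range(d)]
# Hypercube-style hashing: the same hyperplanes made mutually orthogonal,
# so the hash regions are the orthants of a randomly rotated hypercube.
orthogonal_normals = orthonormalize(random_normals)

x = gaussian_vector(d, rng)
hyperplane_code = sign_hash(random_normals, x)
hypercube_code = sign_hash(orthogonal_normals, x)
print(hyperplane_code, hypercube_code)
```

Both hashes depend only on the direction of `x`, which is what makes them suitable for the angular distance.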
In this work, we provide explicit asymptotics for the collision probabilities when using hypercubes to partition the space. For instance, two near-orthogonal vectors are expected to collide with probability (1/pi)^d in dimension d, compared to (1/2)^d when using random hyperplanes. Vectors at angle pi/3 collide with probability (sqrt(3)/pi)^d, compared to (2/3)^d for random hyperplanes, and near-parallel vectors collide with similar asymptotic probabilities in both cases.
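The hyperplane figures above follow from the standard per-hyperplane collision probability 1 - theta/pi for two vectors at angle theta. The short check below reproduces them and places them next to the hypercube values. The hypercube bases are copied from the abstract rather than derived, only the leading exponential term is compared, and the dimension `d = 20` is an arbitrary choice for illustration.

```python
import math

def hyperplane_base(theta):
    """Per-hyperplane collision probability for two vectors at angle theta."""
    return 1.0 - theta / math.pi

# Hyperplane bases computed from the formula above.
hyperplane = {
    "pi/2": hyperplane_base(math.pi / 2),
    "pi/3": hyperplane_base(math.pi / 3),
}
# Hypercube bases as quoted in the abstract (not derived here).
hypercube = {
    "pi/2": 1.0 / math.pi,
    "pi/3": math.sqrt(3.0) / math.pi,
}

d = 20  # hypothetical dimension
for angle in ("pi/2", "pi/3"):
    print(angle, hyperplane[angle] ** d, hypercube[angle] ** d)
```

At both angles the hypercube base is smaller, so distant vectors are separated more aggressively per dimension.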
For c-approximate nearest neighbor searching, this translates to a decrease in the exponent rho of locality-sensitive hashing (LSH) methods by a factor of up to log_2(pi) ~ 1.652 compared to hyperplane LSH. For c = 2, we obtain rho ~ 0.302 for hypercube LSH, improving upon the rho ~ 0.377 for hyperplane LSH. We further describe how to use hypercube LSH in practice, and we consider an example application in the area of lattice algorithms.
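The hyperplane figure for c = 2 can be reproduced with a short calculation. The sketch below assumes a setup that the abstract does not spell out: unit vectors, far points nearly orthogonal (distance sqrt(2)), near points at distance sqrt(2)/c, and rho = log(p1)/log(p2) with the per-hyperplane collision probability 1 - theta/pi. The function name `hyperplane_rho` is hypothetical. The hypercube value is not recomputed, since that requires the asymptotics from the paper itself.

```python
import math

def hyperplane_rho(c):
    """LSH exponent for hyperplane hashing under the assumed random-data setup."""
    # Assumption: far points are at distance sqrt(2) (angle pi/2) on the unit sphere.
    theta_far = math.pi / 2.0
    # Assumption: near points are at distance sqrt(2)/c; convert chord length to angle.
    near_dist = math.sqrt(2.0) / c
    theta_near = 2.0 * math.asin(near_dist / 2.0)
    p1 = 1.0 - theta_near / math.pi  # collision probability for near points
    p2 = 1.0 - theta_far / math.pi   # collision probability for far points
    return math.log(p1) / math.log(p2)

rho = hyperplane_rho(2.0)
max_factor = math.log2(math.pi)  # largest possible improvement factor in rho
print(round(rho, 3), round(max_factor, 3))
```

Under these assumptions the computed exponent agrees with the 0.377 quoted for hyperplane LSH.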
BibTeX Entry
@InProceedings{laarhoven:LIPIcs:2017:8092,
author = {Thijs Laarhoven},
title = {{Hypercube LSH for Approximate near Neighbors}},
booktitle = {42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)},
pages = {7:1--7:20},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-046-0},
ISSN = {1868-8969},
year = {2017},
volume = {83},
editor = {Kim G. Larsen and Hans L. Bodlaender and Jean-Francois Raskin},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2017/8092},
URN = {urn:nbn:de:0030-drops-80926},
doi = {10.4230/LIPIcs.MFCS.2017.7},
annote = {Keywords: (approximate) near neighbors, locality-sensitive hashing, large deviations, dimensionality reduction, lattice algorithms}
}
Keywords: (approximate) near neighbors, locality-sensitive hashing, large deviations, dimensionality reduction, lattice algorithms
Seminar: 42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)
Issue date: 2017
Date of publication: 01.12.2017