License
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.FSTTCS.2012.48
URN: urn:nbn:de:0030-drops-38470
URL: http://drops.dagstuhl.de/opus/volltexte/2012/3847/
Go to the corresponding Portal


Vempala, Santosh S.

Randomly-oriented k-d Trees Adapt to Intrinsic Dimension

pdf-format:
Document 1.pdf (425 KB)


Abstract

The classic k-d tree data structure continues to be widely used in spite of its vulnerability to the so-called curse of dimensionality. Here we provide a rigorous explanation: for randomly rotated data, a k-d tree adapts to the intrinsic dimension of the data and is not affected by the ambient dimension, thus keeping the data structure efficient for objects such as low-dimensional manifolds and sparse data. The main insight of the analysis can be used as an algorithmic pre-processing step to realize the same benefit: rotate the data randomly; then build a k-d tree. Our work can be seen as a refinement of Random Projection trees [Dasgupta 2008], which also adapt to intrinsic dimension but incur higher traversal costs as the resulting cells are polyhedra and not cuboids. Using k-d trees after a random rotation results in cells that are cuboids, thus preserving the traversal efficiency of standard k-d trees.

BibTeX - Entry

@InProceedings{vempala:LIPIcs:2012:3847,
  author =	{Santosh S. Vempala},
  title =	{{Randomly-oriented k-d Trees Adapt to Intrinsic Dimension}},
  booktitle =	{IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2012) },
  pages =	{48--57},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-47-7},
  ISSN =	{1868-8969},
  year =	{2012},
  volume =	{18},
  editor =	{Deepak D'Souza and Telikepalli Kavitha and Jaikumar Radhakrishnan},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2012/3847},
  URN =		{urn:nbn:de:0030-drops-38470},
  doi =		{http://dx.doi.org/10.4230/LIPIcs.FSTTCS.2012.48},
  annote =	{Keywords: Data structures, Nearest Neighbors, Intrinsic Dimension, k-d Tree}
}

Keywords: Data structures, Nearest Neighbors, Intrinsic Dimension, k-d Tree
Seminar: IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2012)
Issue Date: 2012
Date of publication: 10.12.2012


DROPS-Home | Fulltext Search | Imprint Published by LZI