Documents authored by Aumüller, Martin


Document
Invited Talk
Algorithm Engineering for High-Dimensional Similarity Search Problems (Invited Talk)

Authors: Martin Aumüller

Published in: LIPIcs, Volume 160, 18th International Symposium on Experimental Algorithms (SEA 2020)


Abstract
Similarity search problems in high-dimensional data arise in many areas of computer science such as databases, image analysis, machine learning, and natural language processing. One of the most prominent problems is finding the k nearest neighbors of a data point q ∈ ℝ^d in a large set of data points S ⊂ ℝ^d, under some distance measure such as Euclidean distance. In contrast to lower-dimensional settings, we do not know of worst-case efficient data structures for such search problems in high-dimensional data, i.e., data structures that are faster than a linear scan through the data set. However, there is a rich body of (often heuristic) approaches that solve nearest neighbor search problems much faster than such a scan on many real-world data sets. By necessity, the term "solve" here means that these approaches give approximate results that are close to the true k nearest neighbors. In this talk, we survey recent approaches to nearest neighbor search and related problems. The talk consists of three parts: (1) What makes nearest neighbor search difficult? (2) How do current state-of-the-art algorithms work? (3) What are recent advances regarding similarity search on GPUs, in distributed settings, or in external memory?
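
As an illustration of the baseline the abstract refers to, the following is a minimal sketch of exact k-nearest-neighbor search by a linear scan under Euclidean distance. It is not taken from the talk; the function name `k_nearest_linear_scan` and the toy data are assumptions made for this example only.

```python
import heapq
import math


def k_nearest_linear_scan(S, q, k):
    """Return the k points of S closest to q in Euclidean distance.

    Hypothetical helper for illustration: it inspects every point once,
    so the cost grows linearly with the number of points in S.
    """
    return heapq.nsmallest(k, S, key=lambda p: math.dist(p, q))


# Toy data set S and query q in two dimensions (assumed values).
S = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
q = (0.0, 0.1)
print(k_nearest_linear_scan(S, q, 2))
```

Approximate methods of the kind surveyed in the talk aim to return points close to this exact answer while inspecting far fewer points than the scan does.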

Cite as

Martin Aumüller. Algorithm Engineering for High-Dimensional Similarity Search Problems (Invited Talk). In 18th International Symposium on Experimental Algorithms (SEA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 160, pp. 1:1-1:3, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


@InProceedings{aumuller:LIPIcs.SEA.2020.1,
  author =	{Aum\"{u}ller, Martin},
  title =	{{Algorithm Engineering for High-Dimensional Similarity Search Problems}},
  booktitle =	{18th International Symposium on Experimental Algorithms (SEA 2020)},
  pages =	{1:1--1:3},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-148-1},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{160},
  editor =	{Faro, Simone and Cantone, Domenico},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2020.1},
  URN =		{urn:nbn:de:0030-drops-120751},
  doi =		{10.4230/LIPIcs.SEA.2020.1},
  annote =	{Keywords: Nearest neighbor search, Benchmarking}
}
Document
PUFFINN: Parameterless and Universally Fast FInding of Nearest Neighbors

Authors: Martin Aumüller, Tobias Christiani, Rasmus Pagh, and Michael Vesterli

Published in: LIPIcs, Volume 144, 27th Annual European Symposium on Algorithms (ESA 2019)


Abstract
We present PUFFINN, a parameterless LSH-based index for solving the k-nearest neighbor problem with probabilistic guarantees. By parameterless we mean that the user is only required to specify the amount of memory the index is supposed to use and the result quality that should be achieved. The index combines several heuristic ideas known in the literature. By small adaptations to the query algorithm, we make these heuristics rigorous. We perform experiments on real-world and synthetic inputs to evaluate implementation choices and show that the implementation satisfies the quality guarantees while being competitive with other state-of-the-art approaches to nearest neighbor search. We describe a novel synthetic data set that is difficult to solve for almost all existing nearest neighbor search approaches, and for which PUFFINN significantly outperforms previous methods.

Cite as

Martin Aumüller, Tobias Christiani, Rasmus Pagh, and Michael Vesterli. PUFFINN: Parameterless and Universally Fast FInding of Nearest Neighbors. In 27th Annual European Symposium on Algorithms (ESA 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 144, pp. 10:1-10:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@InProceedings{aumuller_et_al:LIPIcs.ESA.2019.10,
  author =	{Aum\"{u}ller, Martin and Christiani, Tobias and Pagh, Rasmus and Vesterli, Michael},
  title =	{{PUFFINN: Parameterless and Universally Fast FInding of Nearest Neighbors}},
  booktitle =	{27th Annual European Symposium on Algorithms (ESA 2019)},
  pages =	{10:1--10:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-124-5},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{144},
  editor =	{Bender, Michael A. and Svensson, Ola and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2019.10},
  URN =		{urn:nbn:de:0030-drops-111317},
  doi =		{10.4230/LIPIcs.ESA.2019.10},
  annote =	{Keywords: Nearest Neighbor Search, Locality-Sensitive Hashing, Adaptive Similarity Search}
}
Document
Theory and Applications of Hashing (Dagstuhl Seminar 17181)

Authors: Martin Dietzfelbinger, Michael Mitzenmacher, Rasmus Pagh, David P. Woodruff, and Martin Aumüller

Published in: Dagstuhl Reports, Volume 7, Issue 5 (2018)


Abstract
This report documents the program and the topics discussed at the 4-day Dagstuhl Seminar 17181 "Theory and Applications of Hashing", which took place May 1-5, 2017. Four long and eighteen short talks covered a wide and diverse range of topics within the theme of the workshop. The program left sufficient space for informal discussions among the 40 participants.

Cite as

Martin Dietzfelbinger, Michael Mitzenmacher, Rasmus Pagh, David P. Woodruff, and Martin Aumüller. Theory and Applications of Hashing (Dagstuhl Seminar 17181). In Dagstuhl Reports, Volume 7, Issue 5, pp. 1-21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


@Article{dietzfelbinger_et_al:DagRep.7.5.1,
  author =	{Dietzfelbinger, Martin and Mitzenmacher, Michael and Pagh, Rasmus and Woodruff, David P. and Aum\"{u}ller, Martin},
  title =	{{Theory and Applications of Hashing (Dagstuhl Seminar 17181)}},
  pages =	{1--21},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2017},
  volume =	{7},
  number =	{5},
  editor =	{Dietzfelbinger, Martin and Mitzenmacher, Michael and Pagh, Rasmus and Woodruff, David P. and Aum\"{u}ller, Martin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.7.5.1},
  URN =		{urn:nbn:de:0030-drops-82788},
  doi =		{10.4230/DagRep.7.5.1},
  annote =	{Keywords: connections to complexity theory, data streaming applications, hash function construction and analysis, hashing primitives, information retrieval applications, locality-sensitive hashing, machine learning applications}
}