DROPS

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2023.39

Subset Sum in Time 2^{n/2} / poly(n)

Authors: Xi Chen, Yaonan Jin, Tim Randolph, and Rocco A. Servedio

Published in: LIPIcs, Volume 275, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)

Abstract

A major goal in the area of exact exponential algorithms is to give an algorithm for the (worst-case) n-input Subset Sum problem that runs in time 2^{(1/2 - c)n} for some constant c > 0. In this paper we give a Subset Sum algorithm with worst-case running time O(2^{n/2} ⋅ n^{-γ}) for a constant γ > 0.5023 in standard word RAM or circuit RAM models. To the best of our knowledge, this is the first improvement on the classical "meet-in-the-middle" algorithm for worst-case Subset Sum, due to Horowitz and Sahni, which can be implemented in time O(2^{n/2}) in these memory models [Horowitz and Sahni, 1974]. Our algorithm combines a number of different techniques, including the "representation method" introduced by Howgrave-Graham and Joux [Howgrave-Graham and Joux, 2010] and subsequent adaptations of the method in Austrin, Kaski, Koivisto, and Nederlof [Austrin et al., 2016], and Nederlof and Węgrzycki [Jesper Nederlof and Karol Wegrzycki, 2021], and "bit-packing" techniques used in the work of Baran, Demaine, and Pǎtraşcu [Baran et al., 2005] on subquadratic algorithms for 3SUM.

Cite as

Xi Chen, Yaonan Jin, Tim Randolph, and Rocco A. Servedio. Subset Sum in Time 2^{n/2} / poly(n). In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 275, pp. 39:1-39:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.APPROX/RANDOM.2023.39,
  author =	{Chen, Xi and Jin, Yaonan and Randolph, Tim and Servedio, Rocco A.},
  title =	{{Subset Sum in Time 2^\{n/2\} / poly(n)}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)},
  pages =	{39:1--39:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-296-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{275},
  editor =	{Megow, Nicole and Smith, Adam},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2023.39},
  URN =		{urn:nbn:de:0030-drops-188641},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2023.39},
  annote =	{Keywords: Exact algorithms, subset sum, log shaving}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ESA.2023.1

On Hashing by (Random) Equations (Invited Talk)

Authors: Martin Dietzfelbinger

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)

Abstract

The talk will consider aspects of the following setup: Assume for each (key) x from a set 𝒰 (the universe) a vector a_x = (a_{x,0},… ,a_{x,{m-1}}) has been chosen. Given a list Z = (z_i)_{i ∈ [m]} of vectors in {0,1}^r we obtain a mapping φ_Z: 𝒰 → {0,1}^r, x ↦ ⟨a_x,Z⟩ := ⨁_{i ∈ [m]} a_{x,i} ⋅ z_i, where ⨁ is bitwise XOR. The simplest way for creating a data structure for calculating φ_Z is to store Z in an array Z[0..m-1] and answer a query for x by returning ⟨ a_x,Z⟩. The length m of the array should be (1+ε)n for some small ε, and calculating this inner product should be fast. In the focus of the talk is the case where for all or for most of the sets S ⊆ 𝒰 of a certain size n the vectors a_x, x ∈ S, are linearly independent. Choosing Z at random will lead to hash families of various degrees of independence. We will be mostly interested in the case where a_x, x ∈ 𝒰 are chosen independently at random from {0,1}^m, according to some distribution 𝒟. We wish to construct (static) retrieval data structures, which means that S ⊂ 𝒰 and some mapping f: S → {0,1}^r are given, and the task is to find Z such that the restriction of φ_Z to S is f. For creating such a data structure it is necessary to solve the linear system (a_x)_{x ∈ S} ⋅ Z = (f(x))_{x ∈ S} for Z. Two problems are central: Under what conditions on m and 𝒟 can we expect the vectors a_x, x ∈ S to be linearly independent, and how can we arrange things so that in this case the system can be solved fast, in particular in time close to linear (in n, assuming a reasonable machine model)? Solutions to these problems, some classical and others that have emerged only in recent years, will be discussed.

Cite as

Martin Dietzfelbinger. On Hashing by (Random) Equations (Invited Talk). In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, p. 1:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{dietzfelbinger:LIPIcs.ESA.2023.1,
  author =	{Dietzfelbinger, Martin},
  title =	{{On Hashing by (Random) Equations}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{1:1--1:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.1},
  URN =		{urn:nbn:de:0030-drops-186545},
  doi =		{10.4230/LIPIcs.ESA.2023.1},
  annote =	{Keywords: Hashing, Retrieval, Linear equations, Randomness}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2023.16

Work-Efficient Query Evaluation with PRAMs

Authors: Jens Keppeler, Thomas Schwentick, and Christopher Spinrath

Published in: LIPIcs, Volume 255, 26th International Conference on Database Theory (ICDT 2023)

Abstract

The paper studies query evaluation in parallel constant time in the PRAM model. While it is well-known that all relational algebra queries can be evaluated in constant time on an appropriate CRCW-PRAM, this paper is interested in the efficiency of evaluation algorithms, that is, in the number of processors or, asymptotically equivalent, in the work. Naive evaluation in the parallel setting results in huge (polynomial) bounds on the work of such algorithms and in presentations of the result sets that can be extremely scattered in memory. The paper first discusses some obstacles for constant time PRAM query evaluation. It presents algorithms for relational operators that are considerably more efficient than the naive approaches. Further it explores three settings, in which efficient sequential query evaluation algorithms exist: acyclic queries, semi-join algebra queries, and join queries - the latter in the worst-case optimal framework. Under natural assumptions on the representation of the database, the work of the given algorithms matches the best sequential algorithms in the case of semi-join queries, and it comes close in the other two settings. An important tool is the compaction technique from Hagerup (1992).

Cite as

Jens Keppeler, Thomas Schwentick, and Christopher Spinrath. Work-Efficient Query Evaluation with PRAMs. In 26th International Conference on Database Theory (ICDT 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 255, pp. 16:1-16:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{keppeler_et_al:LIPIcs.ICDT.2023.16,
  author =	{Keppeler, Jens and Schwentick, Thomas and Spinrath, Christopher},
  title =	{{Work-Efficient Query Evaluation with PRAMs}},
  booktitle =	{26th International Conference on Database Theory (ICDT 2023)},
  pages =	{16:1--16:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-270-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{255},
  editor =	{Geerts, Floris and Vandevoort, Brecht},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2023.16},
  URN =		{urn:nbn:de:0030-drops-177589},
  doi =		{10.4230/LIPIcs.ICDT.2023.16},
  annote =	{Keywords: PRAM, query evaluation, work-efficient, parallel, acyclic queries, free-connex queries}
}

Document

DOI: 10.4230/LIPIcs.ESA.2022.87

Insertion Time of Random Walk Cuckoo Hashing below the Peeling Threshold

Authors: Stefan Walzer

Published in: LIPIcs, Volume 244, 30th Annual European Symposium on Algorithms (ESA 2022)

Abstract

Most hash tables have an insertion time of 𝒪(1), often qualified as "expected" and/or "amortised". While insertions into cuckoo hash tables indeed seem to take 𝒪(1) expected time in practice, only polylogarithmic guarantees are proven in all but the simplest of practically relevant cases. Given the widespread use of cuckoo hashing to implement compact dictionaries and Bloom filter alternatives, closing this gap is an important open problem for theoreticians. In this paper, we show that random walk insertions into cuckoo hash tables take 𝒪(1) expected amortised time when any number k ≥ 3 of hash functions is used and the load factor is below the corresponding peeling threshold (e.g. ≈0.81 for k = 3). To our knowledge, this is the first meaningful guarantee for constant time insertion for cuckoo hashing that works for k ∈ {3,…,9}. In addition to being useful in its own right, we hope that our key-centred analysis method can be a stepping stone on the path to the true end goal: 𝒪(1) time insertions for all load factors below the load threshold (e.g. ≈0.91 for k = 3).

Cite as

Stefan Walzer. Insertion Time of Random Walk Cuckoo Hashing below the Peeling Threshold. In 30th Annual European Symposium on Algorithms (ESA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 244, pp. 87:1-87:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{walzer:LIPIcs.ESA.2022.87,
  author =	{Walzer, Stefan},
  title =	{{Insertion Time of Random Walk Cuckoo Hashing below the Peeling Threshold}},
  booktitle =	{30th Annual European Symposium on Algorithms (ESA 2022)},
  pages =	{87:1--87:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-247-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{244},
  editor =	{Chechik, Shiri and Navarro, Gonzalo and Rotenberg, Eva and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2022.87},
  URN =		{urn:nbn:de:0030-drops-170250},
  doi =		{10.4230/LIPIcs.ESA.2022.87},
  annote =	{Keywords: Cuckoo Hashing, Random Walk, Random Hypergraph, Peeling, Cores}
}

Document

DOI: 10.4230/LIPIcs.ESA.2019.38

Dense Peelable Random Uniform Hypergraphs

Authors: Martin Dietzfelbinger and Stefan Walzer

Published in: LIPIcs, Volume 144, 27th Annual European Symposium on Algorithms (ESA 2019)

Abstract

We describe a new family of k-uniform hypergraphs with independent random edges. The hypergraphs have a high probability of being peelable, i.e. to admit no sub-hypergraph of minimum degree 2, even when the edge density (number of edges over vertices) is close to 1. In our construction, the vertex set is partitioned into linearly arranged segments and each edge is incident to random vertices of k consecutive segments. Quite surprisingly, the linear geometry allows our graphs to be peeled “from the outside in”. The density thresholds f_k for peelability of our hypergraphs (f_3 ≈ 0.918, f_4 ≈ 0.977, f_5 ≈ 0.992, …) are well beyond the corresponding thresholds (c_3 ≈ 0.818, c_4 ≈ 0.772, c_5 ≈ 0.702, …) of standard k-uniform random hypergraphs. To get a grip on f_k, we analyse an idealised peeling process on the random weak limit of our hypergraph family. The process can be described in terms of an operator on [0,1]^ℤ and f_k can be linked to thresholds relating to the operator. These thresholds are then tractable with numerical methods. Random hypergraphs underlie the construction of various data structures based on hashing, for instance invertible Bloom filters, perfect hash functions, retrieval data structures, error correcting codes and cuckoo hash tables, where inputs are mapped to edges using hash functions. Frequently, the data structures rely on peelability of the hypergraph or peelability allows for simple linear time algorithms. Memory efficiency is closely tied to edge density while worst and average case query times are tied to maximum and average edge size. To demonstrate the usefulness of our construction, we used our 3-uniform hypergraphs as a drop-in replacement for the standard 3-uniform hypergraphs in a retrieval data structure by Botelho et al. [Fabiano Cupertino Botelho et al., 2013]. This reduces memory usage from 1.23m bits to 1.12m bits (m being the input size) with almost no change in running time. Using k > 3 attains, at small sacrifices in running time, further improvements to memory usage.

Cite as

Martin Dietzfelbinger and Stefan Walzer. Dense Peelable Random Uniform Hypergraphs. In 27th Annual European Symposium on Algorithms (ESA 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 144, pp. 38:1-38:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{dietzfelbinger_et_al:LIPIcs.ESA.2019.38,
  author =	{Dietzfelbinger, Martin and Walzer, Stefan},
  title =	{{Dense Peelable Random Uniform Hypergraphs}},
  booktitle =	{27th Annual European Symposium on Algorithms (ESA 2019)},
  pages =	{38:1--38:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-124-5},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{144},
  editor =	{Bender, Michael A. and Svensson, Ola and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2019.38},
  URN =		{urn:nbn:de:0030-drops-111599},
  doi =		{10.4230/LIPIcs.ESA.2019.38},
  annote =	{Keywords: Random Hypergraphs, Peeling Threshold, 2-Core, Hashing, Retrieval, Succinct Data Structure}
}

Document

DOI: 10.4230/LIPIcs.ESA.2019.39

Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications

Authors: Martin Dietzfelbinger and Stefan Walzer

Published in: LIPIcs, Volume 144, 27th Annual European Symposium on Algorithms (ESA 2019)

Abstract

In this paper we identify a new class of sparse near-quadratic random Boolean matrices that have full row rank over F_2 = {0,1} with high probability and can be transformed into echelon form in almost linear time by a simple version of Gauss elimination. The random matrix with dimensions n(1-epsilon) x n is generated as follows: In each row, identify a block of length L = O((log n)/epsilon) at a random position. The entries outside the block are 0, the entries inside the block are given by fair coin tosses. Sorting the rows according to the positions of the blocks transforms the matrix into a kind of band matrix, on which, as it turns out, Gauss elimination works very efficiently with high probability. For the proof, the effects of Gauss elimination are interpreted as a ("coin-flipping") variant of Robin Hood hashing, whose behaviour can be captured in terms of a simple Markov model from queuing theory. Bounds for expected construction time and high success probability follow from results in this area. They readily extend to larger finite fields in place of F_2. By employing hashing, this matrix family leads to a new implementation of a retrieval data structure, which represents an arbitrary function f: S -> {0,1} for some set S of m = (1-epsilon)n keys. It requires m/(1-epsilon) bits of space, construction takes O(m/epsilon^2) expected time on a word RAM, while queries take O(1/epsilon) time and access only one contiguous segment of O((log m)/epsilon) bits in the representation (O(1/epsilon) consecutive words on a word RAM). The method is readily implemented and highly practical, and it is competitive with state-of-the-art methods. In a more theoretical variant, which works only for unrealistically large S, we can even achieve construction time O(m/epsilon) and query time O(1), accessing O(1) contiguous memory words for a query. By well-established methods the retrieval data structure leads to efficient constructions of (static) perfect hash functions and (static) Bloom filters with almost optimal space and very local storage access patterns for queries.

Cite as

Martin Dietzfelbinger and Stefan Walzer. Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications. In 27th Annual European Symposium on Algorithms (ESA 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 144, pp. 39:1-39:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{dietzfelbinger_et_al:LIPIcs.ESA.2019.39,
  author =	{Dietzfelbinger, Martin and Walzer, Stefan},
  title =	{{Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications}},
  booktitle =	{27th Annual European Symposium on Algorithms (ESA 2019)},
  pages =	{39:1--39:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-124-5},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{144},
  editor =	{Bender, Michael A. and Svensson, Ola and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2019.39},
  URN =		{urn:nbn:de:0030-drops-111602},
  doi =		{10.4230/LIPIcs.ESA.2019.39},
  annote =	{Keywords: Random Band Matrix, Gauss Elimination, Retrieval, Hashing, Succinct Data Structure, Randomised Data Structure, Robin Hood Hashing, Bloom Filter}
}

@InProceedings{dietzfelbinger_et_al:LIPIcs.ESA.2019.39,
  author =	{Dietzfelbinger, Martin and Walzer, Stefan},
  title =	{{Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications}},
  booktitle =	{27th Annual European Symposium on Algorithms (ESA 2019)},
  pages =	{39:1--39:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-124-5},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{144},
  editor =	{Bender, Michael A. and Svensson, Ola and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2019.39},
  URN =		{urn:nbn:de:0030-drops-111602},
  doi =		{10.4230/LIPIcs.ESA.2019.39},
  annote =	{Keywords: Random Band Matrix, Gauss Elimination, Retrieval, Hashing, Succinct Data Structure, Randomised Data Structure, Robin Hood Hashing, Bloom Filter}
}

Document

DOI: 10.4230/LIPIcs.STACS.2019.24

Constant-Time Retrieval with O(log m) Extra Bits

Authors: Martin Dietzfelbinger and Stefan Walzer

Published in: LIPIcs, Volume 126, 36th International Symposium on Theoretical Aspects of Computer Science (STACS 2019)

Abstract

For a set U (the universe), retrieval is the following problem. Given a finite subset S subseteq U of size m and f : S -> {0,1}^r for a small constant r, build a data structure D_f with the property that for a suitable query algorithm query we have query(D_f,x) = f(x) for all x in S. For x in U setminus S the value query(D_f,x) is arbitrary in {0,1}^r. The number of bits needed for D_f should be (1+epsilon)r m with overhead epsilon = epsilon(m) >= 0 as small as possible, while the query time should be small. Of course, the time for constructing D_f is relevant as well. We assume fully random hash functions on U with constant evaluation time are available. It is known that with epsilon ~= 0.09 one can achieve linear construction time and constant query time, and with overhead epsilon_k ~= e^{-k} it is possible to have O(k) query time and O(m^{1+alpha}) construction time, for arbitrary alpha>0. Furthermore, a theoretical construction with epsilon =O((log log m)/sqrt{log m}) gives constant query time and linear construction time. Known constructions avoiding all overhead, except for a seed value of size O(log log m), require logarithmic query time. In this paper, we present a method for treating the retrieval problem with overhead epsilon = O((log m)/m), which corresponds to O(1) extra memory words (O(log m) bits), and an extremely simple, constant-time query operation. The price to pay is a construction time of O(m^2). We employ the usual framework for retrieval data structures, where construction is effected by solving a sparse linear system of equations over the 2-element field F_2 and a query is effected by a dot product calculation. Our main technical contribution is the design and analysis of a new and natural family of sparse random linear systems with m equations and (1+epsilon)m variables, which combines good locality properties with high probability of having full rank. Paying a larger overhead of epsilon = O((log m)/m^alpha), the construction time can be reduced to O(m^{1+alpha}) for arbitrary constant 0 < alpha < 1. In combination with an adaptation of known techniques for solving sparse linear systems of equations, our approach leads to a highly practical algorithm for retrieval. In a particular benchmark with m = 10^7 we achieve an order-of-magnitude improvement over previous techniques with epsilon = 0.24% instead of the previously best result of epsilon ~= 3%, with better query time and no significant sacrifices in construction time.

Cite as

Martin Dietzfelbinger and Stefan Walzer. Constant-Time Retrieval with O(log m) Extra Bits. In 36th International Symposium on Theoretical Aspects of Computer Science (STACS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 126, pp. 24:1-24:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{dietzfelbinger_et_al:LIPIcs.STACS.2019.24,
  author =	{Dietzfelbinger, Martin and Walzer, Stefan},
  title =	{{Constant-Time Retrieval with O(log m) Extra Bits}},
  booktitle =	{36th International Symposium on Theoretical Aspects of Computer Science (STACS 2019)},
  pages =	{24:1--24:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-100-9},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{126},
  editor =	{Niedermeier, Rolf and Paul, Christophe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2019.24},
  URN =		{urn:nbn:de:0030-drops-102639},
  doi =		{10.4230/LIPIcs.STACS.2019.24},
  annote =	{Keywords: Retrieval, Hashing, Succinct Data Structure, Randomised Data Structure, Structured Gaussian Elimination, Method of Four Russians}
}

Document

DOI: 10.4230/LIPIcs.MFCS.2018.59

A Subquadratic Algorithm for 3XOR

Authors: Martin Dietzfelbinger, Philipp Schlag, and Stefan Walzer

Published in: LIPIcs, Volume 117, 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018)

Abstract

Given a set X of n binary words of equal length w, the 3XOR problem asks for three elements a, b, c in X such that a oplus b=c, where oplus denotes the bitwise XOR operation. The problem can be easily solved on a word RAM with word length w in time O(n^2 log n). Using Han's fast integer sorting algorithm (STOC/J. Algorithms, 2002/2004) this can be reduced to O(n^2 log log n). With randomization or a sophisticated deterministic dictionary construction, creating a hash table for X with constant lookup time leads to an algorithm with (expected) running time O(n^2). At present, seemingly no faster algorithms are known. We present a surprisingly simple deterministic, quadratic time algorithm for 3XOR. Its core is a version of the PATRICIA tree for X, which makes it possible to traverse the set a oplus X in ascending order for arbitrary a in {0, 1}^{w} in linear time. Furthermore, we describe a randomized algorithm for 3XOR with expected running time O(n^2 * min{log^3(w)/w, (log log n)^2/log^2 n}). The algorithm transfers techniques to our setting that were used by Baran, Demaine, and Patrascu (WADS/Algorithmica, 2005/2008) for solving the related int3SUM problem (the same problem with integer addition in place of binary XOR) in expected time o(n^2). As suggested by Jafargholi and Viola (Algorithmica, 2016), linear hash functions are employed. The latter authors also showed that assuming 3XOR needs expected running time n^(2-o(1)) one can prove conditional lower bounds for triangle enumeration just as with 3SUM. We demonstrate that 3XOR can be reduced to other problems as well, treating the examples offline SetDisjointness and offline SetIntersection, which were studied for 3SUM by Kopelowitz, Pettie, and Porat (SODA, 2016).

Cite as

Martin Dietzfelbinger, Philipp Schlag, and Stefan Walzer. A Subquadratic Algorithm for 3XOR. In 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 117, pp. 59:1-59:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{dietzfelbinger_et_al:LIPIcs.MFCS.2018.59,
  author =	{Dietzfelbinger, Martin and Schlag, Philipp and Walzer, Stefan},
  title =	{{A Subquadratic Algorithm for 3XOR}},
  booktitle =	{43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018)},
  pages =	{59:1--59:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-086-6},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{117},
  editor =	{Potapov, Igor and Spirakis, Paul and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2018.59},
  URN =		{urn:nbn:de:0030-drops-96417},
  doi =		{10.4230/LIPIcs.MFCS.2018.59},
  annote =	{Keywords: 3SUM, 3XOR, Randomized Algorithms, Reductions, Conditional Lower Time Bounds}
}

Document

DOI: 10.4230/DagRep.7.5.1

Theory and Applications of Hashing (Dagstuhl Seminar 17181)

Authors: Martin Dietzfelbinger, Michael Mitzenmacher, Rasmus Pagh, David P. Woodruff, and Martin Aumüller

Published in: Dagstuhl Reports, Volume 7, Issue 5 (2018)

Abstract

This report documents the program and the topics discussed of the 4-day Dagstuhl Seminar 17181 "Theory and Applications of Hashing", which took place May 1-5, 2017. Four long and eighteen short talks covered a wide and diverse range of topics within the theme of the workshop. The program left sufficient space for informal discussions among the 40 participants.

Cite as

Martin Dietzfelbinger, Michael Mitzenmacher, Rasmus Pagh, David P. Woodruff, and Martin Aumüller. Theory and Applications of Hashing (Dagstuhl Seminar 17181). In Dagstuhl Reports, Volume 7, Issue 5, pp. 1-21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@Article{dietzfelbinger_et_al:DagRep.7.5.1,
  author =	{Dietzfelbinger, Martin and Mitzenmacher, Michael and Pagh, Rasmus and Woodruff, David P. and Aum\"{u}ller, Martin},
  title =	{{Theory and Applications of Hashing (Dagstuhl Seminar 17181)}},
  pages =	{1--21},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2017},
  volume =	{7},
  number =	{5},
  editor =	{Dietzfelbinger, Martin and Mitzenmacher, Michael and Pagh, Rasmus and Woodruff, David P. and Aum\"{u}ller, Martin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagRep.7.5.1},
  URN =		{urn:nbn:de:0030-drops-82788},
  doi =		{10.4230/DagRep.7.5.1},
  annote =	{Keywords: connections to complexity theory, data streaming applications, hash function construction and analysis, hashing primitives, information retrieval applications, locality-sensitive hashing, machine learning applications}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.STACS.2012.25

On Randomness in Hash Functions (Invited Talk)

Authors: Martin Dietzfelbinger

Published in: LIPIcs, Volume 14, 29th International Symposium on Theoretical Aspects of Computer Science (STACS 2012)

Abstract

In the talk, we shall discuss quality measures for hash functions used in data structures and algorithms, and survey positive and negative results. (This talk is not about cryptographic hash functions.) For the analysis of algorithms involving hash functions, it is often convenient to assume the hash functions used behave fully randomly; in some cases there is no analysis known that avoids this assumption. In practice, one needs to get by with weaker hash functions that can be generated by randomized algorithms. A well-studied range of applications concern realizations of dynamic dictionaries (linear probing, chained hashing, dynamic perfect hashing, cuckoo hashing and its generalizations) or Bloom filters and their variants. A particularly successful and useful means of classification are Carter and Wegman's universal or k-wise independent classes, introduced in 1977. A natural and widely used approach to analyzing an algorithm involving hash functions is to show that it works if a sufficiently strong universal class of hash functions is used, and to substitute one of the known constructions of such classes. This invites research into the question of just how much independence in the hash functions is necessary for an algorithm to work. Some recent analyses that gave impossibility results constructed rather artificial classes that would not work; other results pointed out natural, widely used hash classes that would not work in a particular application. Only recently it was shown that under certain assumptions on some entropy present in the set of keys even 2-wise independent hash classes will lead to strong randomness properties in the hash values. The negative results show that these results may not be taken as justification for using weak hash classes indiscriminately, in particular for key sets with structure. When stronger independence properties are needed for a theoretical analysis, one may resort to classic constructions. Only in 2003 it was found out how full randomness can be simulated using only linear space overhead (which is optimal). The "split-and-share" approach can be used to justify the full randomness assumption in some situations in which full randomness is needed for the analysis to go through, like in many applications involving multiple hash functions (e.g., generalized versions of cuckoo hashing with multiple hash functions or larger bucket sizes, load balancing, Bloom filters and variants, or minimal perfect hash function constructions). For practice, efficiency considerations beyond constant factors are important. It is not hard to construct very efficient 2-wise independent classes. Using k-wise independent classes for constant k bigger than 3 has become feasible in practice only by new constructions involving tabulation. This goes together well with the quite new result that linear probing works with 5-independent hash functions. Recent developments suggest that the classification of hash function constructions by their degree of independence alone may not be adequate in some cases. Thus, one may want to analyze the behavior of specific hash classes in specific applications, circumventing the concept of k-wise independence. Several such results were recently achieved concerning hash functions that utilize tabulation. In particular if the analysis of the application involves using randomness properties in graphs and hypergraphs (generalized cuckoo hashing, also in the version with a "stash", or load balancing), a hash class combining k-wise independence with tabulation has turned out to be very powerful.

Cite as

Martin Dietzfelbinger. On Randomness in Hash Functions (Invited Talk). In 29th International Symposium on Theoretical Aspects of Computer Science (STACS 2012). Leibniz International Proceedings in Informatics (LIPIcs), Volume 14, pp. 25-28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)

Copy BibTex To Clipboard

@InProceedings{dietzfelbinger:LIPIcs.STACS.2012.25,
  author =	{Dietzfelbinger, Martin},
  title =	{{On Randomness in Hash Functions}},
  booktitle =	{29th International Symposium on Theoretical Aspects of Computer Science (STACS 2012)},
  pages =	{25--28},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-35-4},
  ISSN =	{1868-8969},
  year =	{2012},
  volume =	{14},
  editor =	{D\"{u}rr, Christoph and Wilke, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2012.25},
  URN =		{urn:nbn:de:0030-drops-33884},
  doi =		{10.4230/LIPIcs.STACS.2012.25},
  annote =	{Keywords: Algorithms, hash functions, randomized algorithms, data structures, graphs, hypergraphs}
}

Document

DOI: 10.4230/LIPIcs.STACS.2008.1348

Tight Bounds for Blind Search on the Integers

Authors: Martin Dietzfelbinger, Jonathan E. Rowe, Ingo Wegener, and Philipp Woelfel

Published in: LIPIcs, Volume 1, 25th International Symposium on Theoretical Aspects of Computer Science (2008)

Abstract

We analyze a simple random process in which a token is moved in the interval $A={0,dots,n$: Fix a probability distribution $mu$ over ${1,dots,n$. Initially, the token is placed in a random position in $A$. In round $t$, a random value $d$ is chosen according to $mu$. If the token is in position $ageq d$, then it is moved to position $a-d$. Otherwise it stays put. Let $T$ be the number of rounds until the token reaches position 0. We show tight bounds for the expectation of $T$ for the optimal distribution $mu$. More precisely, we show that $min_mu{E_mu(T)=Thetaleft((log n)^2 ight)$. For the proof, a novel potential function argument is introduced. The research is motivated by the problem of approximating the minimum of a continuous function over $[0,1]$ with a ``blind'' optimization strategy.

Cite as

Martin Dietzfelbinger, Jonathan E. Rowe, Ingo Wegener, and Philipp Woelfel. Tight Bounds for Blind Search on the Integers. In 25th International Symposium on Theoretical Aspects of Computer Science. Leibniz International Proceedings in Informatics (LIPIcs), Volume 1, pp. 241-252, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

Copy BibTex To Clipboard

@InProceedings{dietzfelbinger_et_al:LIPIcs.STACS.2008.1348,
  author =	{Dietzfelbinger, Martin and Rowe, Jonathan E. and Wegener, Ingo and Woelfel, Philipp},
  title =	{{Tight Bounds for Blind Search on the Integers}},
  booktitle =	{25th International Symposium on Theoretical Aspects of Computer Science},
  pages =	{241--252},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-06-4},
  ISSN =	{1868-8969},
  year =	{2008},
  volume =	{1},
  editor =	{Albers, Susanne and Weil, Pascal},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2008.1348},
  URN =		{urn:nbn:de:0030-drops-13486},
  doi =		{10.4230/LIPIcs.STACS.2008.1348},
  annote =	{Keywords: }
}

Document

DOI: 10.4230/DagSemProc.07391.1

07391 Abstracts Collection – Probabilistic Methods in the Design and Analysis of Algorithms

Authors: Martin Dietzfelbinger, Shang-Hua Teng, Eli Upfal, and Berthold Vöcking

Published in: Dagstuhl Seminar Proceedings, Volume 7391, Probabilistic Methods in the Design and Analysis of Algorithms (2007)

Abstract

From 23.09.2007 to 28.09.2007, the Dagstuhl Seminar 07391 "Probabilistic Methods in the Design and Analysis of Algorithms''was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. The seminar brought together leading researchers in probabilistic methods to strengthen and foster collaborations among various areas of Theoretical Computer Science. The interaction between researchers using randomization in algorithm design and researchers studying known algorithms and heuristics in probabilistic models enhanced the research of both groups in developing new complexity frameworks and in obtaining new algorithmic results. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Cite as

Martin Dietzfelbinger, Shang-Hua Teng, Eli Upfal, and Berthold Vöcking. 07391 Abstracts Collection – Probabilistic Methods in the Design and Analysis of Algorithms. In Probabilistic Methods in the Design and Analysis of Algorithms. Dagstuhl Seminar Proceedings, Volume 7391, pp. 1-18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2007)

Copy BibTex To Clipboard

@InProceedings{dietzfelbinger_et_al:DagSemProc.07391.1,
  author =	{Dietzfelbinger, Martin and Teng, Shang-Hua and Upfal, Eli and V\"{o}cking, Berthold},
  title =	{{07391 Abstracts Collection – Probabilistic Methods in the Design and Analysis of Algorithms}},
  booktitle =	{Probabilistic Methods in the Design and Analysis of Algorithms},
  pages =	{1--18},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2007},
  volume =	{7391},
  editor =	{Martin Dietzfelbinger and Shang-Hua Teng and Eli Upfal and Berthold V\"{o}cking},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagSemProc.07391.1},
  URN =		{urn:nbn:de:0030-drops-12915},
  doi =		{10.4230/DagSemProc.07391.1},
  annote =	{Keywords: Algorithms, Randomization, Probabilistic analysis, Complexity}
}

Document

DOI: 10.4230/DagSemProc.07391.2

Sampling-based Approximation Algorithms for Multi-stage Stochastic Optimization

Authors: Chaitanya Swamy and David Shmoys

Published in: Dagstuhl Seminar Proceedings, Volume 7391, Probabilistic Methods in the Design and Analysis of Algorithms (2007)

Abstract

Stochastic optimization problems provide a means to model uncertainty in the input data where the uncertainty is modeled by a probability distribution over the possible realizations of the data. We consider a broad class of these problems, called {it multi-stage stochastic programming problems with recourse}, where the uncertainty evolves through a series of stages and one take decisions in each stage in response to the new information learned. These problems are often computationally quite difficult with even very specialized (sub)problems being $#P$-complete. We obtain the first fully polynomial randomized approximation scheme (FPRAS) for a broad class of multi-stage stochastic linear programming problems with any constant number of stages, without placing any restrictions on the underlying probability distribution or on the cost structure of the input. For any fixed $k$, for a rich class of $k$-stage stochastic linear programs (LPs), we show that, for any probability distribution, for any $epsilon>0$, one can compute, with high probability, a solution with expected cost at most $(1+e)$ times the optimal expected cost, in time polynomial in the input size, $frac{1}{epsilon}$, and a parameter $lambda$ that is an upper bound on the cost-inflation over successive stages. Moreover, the algorithm analyzed is a simple and intuitive algorithm that is often used in practice, the {it sample average approximation} (SAA) method. In this method, one draws certain samples from the underlying distribution, constructs an approximate distribution from these samples, and solves the stochastic problem given by this approximate distribution. This is the first result establishing that the SAA method yields near-optimal solutions for (a class of) multi-stage programs with a polynomial number of samples. As a corollary of this FPRAS, by adapting a generic rounding technique of Shmoys and Swamy, we also obtain the first approximation algorithms for the analogous class of multi-stage stochastic integer programs, which includes the multi-stage versions of the set cover, vertex cover, multicut on trees, facility location, and multicommodity flow problems.

Cite as

Chaitanya Swamy and David Shmoys. Sampling-based Approximation Algorithms for Multi-stage Stochastic Optimization. In Probabilistic Methods in the Design and Analysis of Algorithms. Dagstuhl Seminar Proceedings, Volume 7391, pp. 1-24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2007)

Copy BibTex To Clipboard

@InProceedings{swamy_et_al:DagSemProc.07391.2,
  author =	{Swamy, Chaitanya and Shmoys, David},
  title =	{{Sampling-based Approximation Algorithms for Multi-stage Stochastic Optimization}},
  booktitle =	{Probabilistic Methods in the Design and Analysis of Algorithms},
  pages =	{1--24},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2007},
  volume =	{7391},
  editor =	{Martin Dietzfelbinger and Shang-Hua Teng and Eli Upfal and Berthold V\"{o}cking},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagSemProc.07391.2},
  URN =		{urn:nbn:de:0030-drops-12906},
  doi =		{10.4230/DagSemProc.07391.2},
  annote =	{Keywords: Stochastic optimization, approximation algorithms, randomized algorithms, linear programming}
}

Document

DOI: 10.4230/DagSemProc.07391.3

Smoothed Analysis of Binary Search Trees and Quicksort Under Additive Noise

Authors: Bodo Manthey and Till Tantau

Published in: Dagstuhl Seminar Proceedings, Volume 7391, Probabilistic Methods in the Design and Analysis of Algorithms (2007)

Abstract

While the height of binary search trees is linear in the worst case, their average height is logarithmic. We investigate what happens in between, i.e., when the randomness is limited, by analyzing the smoothed height of binary search trees: Randomly perturb a given (adversarial) sequence and then take the expected height of the binary search tree generated by the resulting sequence. As perturbation models, we consider partial permutations, where some elements are randomly permuted, and additive noise, where random numbers are added to the adversarial sequence. We prove tight bounds for the smoothed height of binary search trees under these models. We also obtain tight bounds for smoothed number of left-to-right maxima. Furthermore, we exploit the results obtained to get bounds for the smoothed number of comparisons that quicksort needs.

Cite as

Bodo Manthey and Till Tantau. Smoothed Analysis of Binary Search Trees and Quicksort Under Additive Noise. In Probabilistic Methods in the Design and Analysis of Algorithms. Dagstuhl Seminar Proceedings, Volume 7391, pp. 1-19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2007)

Copy BibTex To Clipboard

@InProceedings{manthey_et_al:DagSemProc.07391.3,
  author =	{Manthey, Bodo and Tantau, Till},
  title =	{{Smoothed Analysis of Binary Search Trees and Quicksort Under Additive Noise}},
  booktitle =	{Probabilistic Methods in the Design and Analysis of Algorithms},
  pages =	{1--19},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2007},
  volume =	{7391},
  editor =	{Martin Dietzfelbinger and Shang-Hua Teng and Eli Upfal and Berthold V\"{o}cking},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagSemProc.07391.3},
  URN =		{urn:nbn:de:0030-drops-12893},
  doi =		{10.4230/DagSemProc.07391.3},
  annote =	{Keywords: Smoothed Analysis, Binary Search Trees, Quicksort, Left-to-right Maxima}
}

14 Search Results for "Dietzfelbinger, Martin"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message