Document

**Published in:** LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)

A minimal perfect hash function (or MPHF) maps a set of n keys to [n] := {1, …, n} without collisions. Such functions find widespread application, e.g., in bioinformatics and databases. In this paper we revisit PTHash, a construction technique particularly designed for fast queries. PTHash distributes the input keys into small buckets and, for each bucket, it searches for a hash function seed that places its keys in the output domain without collisions. The collection of all seeds is then stored in a compressed way. Since the first buckets are easier to place, buckets are considered in non-increasing order of size. Additionally, PTHash heuristically produces an imbalanced distribution of bucket sizes by distributing 60% of the keys into 30% of the buckets.
Our main contribution is to characterize, up to lower order terms, an optimal choice for the expected bucket sizes, improving construction throughput for space efficient configurations both in theory and practice. Further contributions include a new encoding scheme for seeds that works across partitions of the data structure and a GPU parallelization.
Compared to PTHash, PHOBIC is 0.17 bits/key more space efficient for the same query time and construction throughput. For a configuration with fast queries, our GPU implementation can construct an MPHF at 2.17 bits/key in 28 ns/key, which can be queried in 37 ns/query on the CPU.
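
The bucket-and-seed idea described in this abstract can be illustrated with a minimal Python sketch. It is not the PTHash or PHOBIC implementation: the BLAKE2b-based hash `h`, the uniform bucket assignment, the parameter `avg_bucket_size`, and the uncompressed seed list are all assumptions made for illustration, keys are assumed to be distinct strings, and the output range is 0, …, n-1 rather than 1, …, n.

```python
import hashlib

def h(key: str, seed: int, modulus: int) -> int:
    """Seeded hash of key into range(modulus) (hypothetical helper)."""
    digest = hashlib.blake2b(key.encode(), digest_size=8,
                             salt=seed.to_bytes(8, "little")).digest()
    return int.from_bytes(digest, "little") % modulus

def build_mphf(keys, avg_bucket_size=3):
    """Return one seed per bucket such that all keys land in distinct slots."""
    n = len(keys)
    num_buckets = max(1, n // avg_bucket_size)
    buckets = [[] for _ in range(num_buckets)]
    for key in keys:
        buckets[h(key, 0, num_buckets)].append(key)  # seed 0 picks the bucket
    seeds = [0] * num_buckets
    taken = [False] * n
    # Place buckets in non-increasing order of size.
    for b in sorted(range(num_buckets), key=lambda i: -len(buckets[i])):
        seed = 1
        while True:
            slots = [h(key, seed, n) for key in buckets[b]]
            if len(set(slots)) == len(slots) and not any(taken[s] for s in slots):
                break
            seed += 1  # try the next seed until the bucket fits
        for s in slots:
            taken[s] = True
        seeds[b] = seed
    return seeds

def query(key: str, seeds, n: int) -> int:
    """Evaluate the MPHF: look up the bucket's seed, then hash with it."""
    return h(key, seeds[h(key, 0, len(seeds))], n)
```

For example, `seeds = build_mphf(keys)` followed by `query(k, seeds, len(keys))` for every `k` in `keys` yields each value in `range(len(keys))` exactly once. The sketch uses equal expected bucket sizes and stores the seeds as a plain list, which are exactly the two aspects the paper sets out to optimize.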

Stefan Hermann, Hans-Peter Lehmann, Giulio Ermanno Pibiri, Peter Sanders, and Stefan Walzer. PHOBIC: Perfect Hashing With Optimized Bucket Sizes and Interleaved Coding. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 69:1-69:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

@InProceedings{hermann_et_al:LIPIcs.ESA.2024.69,
  author = {Hermann, Stefan and Lehmann, Hans-Peter and Pibiri, Giulio Ermanno and Sanders, Peter and Walzer, Stefan},
  title = {{PHOBIC: Perfect Hashing With Optimized Bucket Sizes and Interleaved Coding}},
  booktitle = {32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages = {69:1--69:17},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-338-6},
  ISSN = {1868-8969},
  year = {2024},
  volume = {308},
  editor = {Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.69},
  URN = {urn:nbn:de:0030-drops-211405},
  doi = {10.4230/LIPIcs.ESA.2024.69},
  annote = {Keywords: Compressed Data Structures, Minimal Perfect Hashing, GPU}
}

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

We consider the problem of reconstructing the symmetric difference between similar sets from their representations (sketches) of size linear in the number of differences. Exact solutions to this problem are based on error-correcting coding techniques and suffer from a large decoding time. Existing probabilistic solutions based on Invertible Bloom Lookup Tables (IBLTs) are time-efficient but offer insufficient success guarantees for many applications. Here we propose a tunable trade-off between the two approaches combining the efficiency of IBLTs with exponentially decreasing failure probability. The proof relies on a refined analysis of IBLTs proposed in (Bæk Tejs Houen et al. SOSA 2023) which has an independent interest. We also propose a modification of our algorithm that enables telling apart the elements of each set in the symmetric difference.

Djamal Belazzougui, Gregory Kucherov, and Stefan Walzer. Better Space-Time-Robustness Trade-Offs for Set Reconciliation. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 20:1-20:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

@InProceedings{belazzougui_et_al:LIPIcs.ICALP.2024.20,
  author = {Belazzougui, Djamal and Kucherov, Gregory and Walzer, Stefan},
  title = {{Better Space-Time-Robustness Trade-Offs for Set Reconciliation}},
  booktitle = {51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages = {20:1--20:19},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-322-5},
  ISSN = {1868-8969},
  year = {2024},
  volume = {297},
  editor = {Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.20},
  URN = {urn:nbn:de:0030-drops-201639},
  doi = {10.4230/LIPIcs.ICALP.2024.20},
  annote = {Keywords: data structures, hashing, set reconciliation, invertible Bloom lookup tables, random hypergraphs, BCH codes}
}

RANDOM

**Published in:** LIPIcs, Volume 245, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)

The Chinese restaurant process is a stochastic process closely related to the Dirichlet process that groups sequentially arriving objects into a variable number of classes, such that within each class objects are cyclically ordered. A popular description involves a restaurant, where customers arrive one by one and either sit down next to a randomly chosen customer at one of the existing tables or open a new table. The full state of the process after n steps is given by a permutation of the n objects and cannot be represented in sublinear space. In particular, if we only need specific information about a few objects or classes it would be preferable to obtain the answers without simulating the process completely.
A recent line of research [Oded Goldreich et al., 2010; Moni Naor and Asaf Nussboim, 2007; Amartya Shankha Biswas et al., 2020; Guy Even et al., 2021] attempts to provide access to huge random objects without fully instantiating them. Such local access implementations provide answers to a sequence of queries about the random object, following the same distribution as if the object was fully generated. In this paper, we provide a local access implementation for a generalization of the Chinese restaurant process described above. Our implementation can be used to answer any sequence of adaptive queries about class affiliation of objects, number and sizes of classes at any time, position of elements within a class, or founding time of a class. The running time per query is polylogarithmic in the total size of the object, with high probability. Our approach relies on some ideas from the recent local access implementation for preferential attachment trees by Even et al. [Guy Even et al., 2021]. Such trees are related to the Chinese restaurant process in the sense that both involve a "rich-get-richer" phenomenon. A novel ingredient in our implementation is to embed the process in continuous time, in which the evolution of the different classes becomes stochastically independent [Joyce and Tavaré, 1987]. This independence is used to keep the probabilistic structure manageable even if many queries have already been answered. As similar embeddings are available for a wide range of urn processes [Krishna B. Athreya and Samuel Karlin, 1968], we believe that our approach may be applicable more generally. Moreover, local access implementations for birth and death processes that we encounter along the way may be of independent interest.
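
For contrast with the local access approach, the naive baseline that generates the whole process in linear time and space can be sketched as follows. This is not the paper's method; it assumes the standard one-parameter variant in which customer i (counting from 0) opens a new table with probability θ/(i+θ), and the names `simulate_crp`, `right`, and `table` are hypothetical.

```python
import random

def simulate_crp(n, theta=1.0, rng=None):
    """Fully simulate n arrivals; returns (right, table).

    right[i] is the customer seated directly to the right of customer i,
    so `right` encodes the permutation whose cycles are the tables.
    table[i] is the index of the table at which customer i sits.
    """
    rng = rng or random.Random()
    right, table = [], []
    num_tables = 0
    for i in range(n):
        if rng.random() < theta / (i + theta):
            # Open a new table: a cycle of length one.
            right.append(i)
            table.append(num_tables)
            num_tables += 1
        else:
            # Sit down next to a uniformly random earlier customer j.
            j = rng.randrange(i)
            right.append(right[j])
            right[j] = i
            table.append(table[j])
    return right, table
```

Answering even a single query such as "which table does customer i sit at?" with this baseline requires simulating all earlier arrivals, which is the cost a local access implementation is meant to avoid.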

Peter Mörters, Christian Sohler, and Stefan Walzer. A Sublinear Local Access Implementation for the Chinese Restaurant Process. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 245, pp. 28:1-28:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

@InProceedings{morters_et_al:LIPIcs.APPROX/RANDOM.2022.28,
  author = {M\"{o}rters, Peter and Sohler, Christian and Walzer, Stefan},
  title = {{A Sublinear Local Access Implementation for the Chinese Restaurant Process}},
  booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)},
  pages = {28:1--28:18},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-249-5},
  ISSN = {1868-8969},
  year = {2022},
  volume = {245},
  editor = {Chakrabarti, Amit and Swamy, Chaitanya},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2022.28},
  URN = {urn:nbn:de:0030-drops-171500},
  doi = {10.4230/LIPIcs.APPROX/RANDOM.2022.28},
  annote = {Keywords: Chinese restaurant process, Dirichlet process, sublinear time algorithm, random recursive tree, random permutation, random partition, Ewens distribution, simulation, local access implementation, continuous time embedding}
}

**Published in:** LIPIcs, Volume 244, 30th Annual European Symposium on Algorithms (ESA 2022)

Most hash tables have an insertion time of 𝒪(1), often qualified as "expected" and/or "amortised". While insertions into cuckoo hash tables indeed seem to take 𝒪(1) expected time in practice, only polylogarithmic guarantees are proven in all but the simplest of practically relevant cases. Given the widespread use of cuckoo hashing to implement compact dictionaries and Bloom filter alternatives, closing this gap is an important open problem for theoreticians.
In this paper, we show that random walk insertions into cuckoo hash tables take 𝒪(1) expected amortised time when any number k ≥ 3 of hash functions is used and the load factor is below the corresponding peeling threshold (e.g. ≈0.81 for k = 3). To our knowledge, this is the first meaningful guarantee for constant time insertion for cuckoo hashing that works for k ∈ {3,…,9}.
In addition to being useful in its own right, we hope that our key-centred analysis method can be a stepping stone on the path to the true end goal: 𝒪(1) time insertions for all load factors below the load threshold (e.g. ≈0.91 for k = 3).
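
A random walk insertion of the kind analysed here can be sketched as follows. This is one simple variant written for illustration, not code from the paper: the helper `make_choices` stands in for k hash functions by seeding a pseudo-random generator with the (integer) key, and `max_steps` is an arbitrary safety cap.

```python
import random

def make_choices(n, k=3):
    """Hypothetical stand-in for k hash functions: k distinct slots per key."""
    def choices(key):
        return random.Random(key).sample(range(n), k)
    return choices

def random_walk_insert(table, key, choices, rng, max_steps=10_000):
    """Insert key into table; return the number of evictions performed."""
    for evictions in range(max_steps):
        slots = choices(key)
        for s in slots:
            if table[s] is None:
                table[s] = key        # a free slot among the k choices
                return evictions
        # All k slots are occupied: evict a random occupant and continue
        # the walk with the evicted key.
        victim = rng.choice(slots)
        table[victim], key = key, table[victim]
    raise RuntimeError("random walk did not terminate within max_steps")
```

Filling a table of 1000 slots with 700 integer keys (load factor 0.7, below the peeling threshold of about 0.81 quoted above for k = 3) and averaging the returned eviction counts gives an empirical view of the insertion cost.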

Stefan Walzer. Insertion Time of Random Walk Cuckoo Hashing below the Peeling Threshold. In 30th Annual European Symposium on Algorithms (ESA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 244, pp. 87:1-87:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

@InProceedings{walzer:LIPIcs.ESA.2022.87,
  author = {Walzer, Stefan},
  title = {{Insertion Time of Random Walk Cuckoo Hashing below the Peeling Threshold}},
  booktitle = {30th Annual European Symposium on Algorithms (ESA 2022)},
  pages = {87:1--87:11},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-247-1},
  ISSN = {1868-8969},
  year = {2022},
  volume = {244},
  editor = {Chechik, Shiri and Navarro, Gonzalo and Rotenberg, Eva and Herman, Grzegorz},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2022.87},
  URN = {urn:nbn:de:0030-drops-170250},
  doi = {10.4230/LIPIcs.ESA.2022.87},
  annote = {Keywords: Cuckoo Hashing, Random Walk, Random Hypergraph, Peeling, Cores}
}

**Published in:** LIPIcs, Volume 233, 20th International Symposium on Experimental Algorithms (SEA 2022)

A retrieval data structure for a static function f: S → {0,1}^r supports queries that return f(x) for any x ∈ S. Retrieval data structures can be used to implement a static approximate membership query data structure (AMQ), i.e., a Bloom filter alternative, with false positive rate 2^{-r}. The information-theoretic lower bound for both tasks is r|S| bits. While succinct theoretical constructions using (1+o(1))r|S| bits were known, these could not achieve very small overheads in practice because they have an unfavorable space-time tradeoff hidden in the asymptotic costs or because small overheads would only be reached for physically impossible input sizes. With bumped ribbon retrieval (BuRR), we present the first practical succinct retrieval data structure. In an extensive experimental evaluation BuRR achieves space overheads well below 1% while being faster than most previously used retrieval data structures (typically with space overheads at least an order of magnitude larger) and faster than classical Bloom filters (with space overhead ≥ 44%). This efficiency, including favorable constants, stems from a combination of simplicity, word parallelism, and high locality.
We additionally describe homogeneous ribbon filter AMQs, which are even simpler and faster at the price of slightly larger space overhead.

Peter C. Dillinger, Lorenz Hübschle-Schneider, Peter Sanders, and Stefan Walzer. Fast Succinct Retrieval and Approximate Membership Using Ribbon. In 20th International Symposium on Experimental Algorithms (SEA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 233, pp. 4:1-4:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

@InProceedings{dillinger_et_al:LIPIcs.SEA.2022.4,
  author = {Dillinger, Peter C. and H\"{u}bschle-Schneider, Lorenz and Sanders, Peter and Walzer, Stefan},
  title = {{Fast Succinct Retrieval and Approximate Membership Using Ribbon}},
  booktitle = {20th International Symposium on Experimental Algorithms (SEA 2022)},
  pages = {4:1--4:20},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-251-8},
  ISSN = {1868-8969},
  year = {2022},
  volume = {233},
  editor = {Schulz, Christian and U\c{c}ar, Bora},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2022.4},
  URN = {urn:nbn:de:0030-drops-165385},
  doi = {10.4230/LIPIcs.SEA.2022.4},
  annote = {Keywords: AMQ, Bloom filter, dictionary, linear algebra, randomized algorithm, retrieval data structure, static function data structure, succinct data structure, perfect hashing}
}

**Published in:** LIPIcs, Volume 157, 10th International Conference on Fun with Algorithms (FUN 2021) (2020)

Card-based cryptography, as first proposed by den Boer [den Boer, 1989], enables secure multiparty computation using only a deck of playing cards. Many protocols as of yet come with an “honest-but-curious” disclaimer. However, modern cryptography aims to provide security also in the presence of active attackers that deviate from the protocol description. In the few places where authors argue for the active security of their protocols, this is done ad-hoc and restricted to the concrete operations needed, often using additional physical tools, such as envelopes or sliding cover boxes. This paper provides the first systematic approach to active security in card-based protocols.
The main technical contribution concerns shuffling operations. A shuffle randomly permutes the cards according to a well-defined distribution but hides the chosen permutation from the players. We show how the large and natural class of uniform closed shuffles, which are shuffles that select a permutation uniformly at random from a permutation group, can be implemented using only a linear number of helping cards. This ensures that any protocol in the model of Mizuki and Shizuya [Mizuki and Shizuya, 2014] can be realized in an actively secure fashion, as long as it is secure in this abstract model and restricted to uniform closed shuffles. Uniform closed shuffles are already sufficient for securely computing any circuit [Mizuki and Sone, 2009]. In the process, we develop a more concrete model for card-based cryptographic protocols with two players, which we believe to be of independent interest.

Alexander Koch and Stefan Walzer. Foundations for Actively Secure Card-Based Cryptography. In 10th International Conference on Fun with Algorithms (FUN 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 157, pp. 17:1-17:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{koch_et_al:LIPIcs.FUN.2021.17,
  author = {Koch, Alexander and Walzer, Stefan},
  title = {{Foundations for Actively Secure Card-Based Cryptography}},
  booktitle = {10th International Conference on Fun with Algorithms (FUN 2021)},
  pages = {17:1--17:23},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-145-0},
  ISSN = {1868-8969},
  year = {2020},
  volume = {157},
  editor = {Farach-Colton, Martin and Prencipe, Giuseppe and Uehara, Ryuhei},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FUN.2021.17},
  URN = {urn:nbn:de:0030-drops-127786},
  doi = {10.4230/LIPIcs.FUN.2021.17},
  annote = {Keywords: Card-Based Protocols, Card Shuffling, Secure Multiparty Computation, Active Security, Cryptography without Computers}
}

**Published in:** LIPIcs, Volume 144, 27th Annual European Symposium on Algorithms (ESA 2019)

We describe a new family of k-uniform hypergraphs with independent random edges. The hypergraphs have a high probability of being peelable, i.e. to admit no sub-hypergraph of minimum degree 2, even when the edge density (number of edges over vertices) is close to 1.
In our construction, the vertex set is partitioned into linearly arranged segments and each edge is incident to random vertices of k consecutive segments. Quite surprisingly, the linear geometry allows our graphs to be peeled “from the outside in”. The density thresholds f_k for peelability of our hypergraphs (f_3 ≈ 0.918, f_4 ≈ 0.977, f_5 ≈ 0.992, …) are well beyond the corresponding thresholds (c_3 ≈ 0.818, c_4 ≈ 0.772, c_5 ≈ 0.702, …) of standard k-uniform random hypergraphs.
To get a grip on f_k, we analyse an idealised peeling process on the random weak limit of our hypergraph family. The process can be described in terms of an operator on [0,1]^ℤ and f_k can be linked to thresholds relating to the operator. These thresholds are then tractable with numerical methods.
Random hypergraphs underlie the construction of various data structures based on hashing, for instance invertible Bloom filters, perfect hash functions, retrieval data structures, error correcting codes and cuckoo hash tables, where inputs are mapped to edges using hash functions. Frequently, the data structures rely on peelability of the hypergraph or peelability allows for simple linear time algorithms. Memory efficiency is closely tied to edge density while worst and average case query times are tied to maximum and average edge size.
To demonstrate the usefulness of our construction, we used our 3-uniform hypergraphs as a drop-in replacement for the standard 3-uniform hypergraphs in a retrieval data structure by Botelho et al. [Fabiano Cupertino Botelho et al., 2013]. This reduces memory usage from 1.23m bits to 1.12m bits (m being the input size) with almost no change in running time. Using k > 3 attains, at small sacrifices in running time, further improvements to memory usage.
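
The construction and the peeling process described in this abstract can be sketched in a few lines of Python. The sketch makes simplifying assumptions that may differ from the paper: all segments have the same size, the first segment of each edge is chosen uniformly, and the names `segmented_edges` and `peel` are hypothetical.

```python
import random
from collections import defaultdict

def segmented_edges(num_segments, segment_size, k, num_edges, rng):
    """Each edge picks one random vertex in each of k consecutive segments."""
    edges = []
    for _ in range(num_edges):
        first = rng.randrange(num_segments - k + 1)
        edges.append(tuple((first + j) * segment_size + rng.randrange(segment_size)
                           for j in range(k)))
    return edges

def peel(edges):
    """Return a peeling order as (edge index, free vertex) pairs,
    or None if a sub-hypergraph of minimum degree 2 remains."""
    incident = defaultdict(set)
    for e, vertices in enumerate(edges):
        for v in vertices:
            incident[v].add(e)
    stack = [v for v, es in incident.items() if len(es) == 1]
    order = []
    while stack:
        v = stack.pop()
        if len(incident[v]) != 1:
            continue                  # v's last edge was already peeled
        (e,) = incident[v]
        order.append((e, v))
        for u in edges[e]:
            incident[u].discard(e)
            if len(incident[u]) == 1:
                stack.append(u)       # u just became a degree-1 vertex
    return order if len(order) == len(edges) else None
```

Calling `peel(segmented_edges(...))` for a range of edge densities (number of edges divided by `num_segments * segment_size`) is one way to observe the threshold behaviour empirically. The peeling routine itself is generic and works for any hypergraph given as a list of vertex tuples.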

Martin Dietzfelbinger and Stefan Walzer. Dense Peelable Random Uniform Hypergraphs. In 27th Annual European Symposium on Algorithms (ESA 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 144, pp. 38:1-38:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

@InProceedings{dietzfelbinger_et_al:LIPIcs.ESA.2019.38,
  author = {Dietzfelbinger, Martin and Walzer, Stefan},
  title = {{Dense Peelable Random Uniform Hypergraphs}},
  booktitle = {27th Annual European Symposium on Algorithms (ESA 2019)},
  pages = {38:1--38:16},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-124-5},
  ISSN = {1868-8969},
  year = {2019},
  volume = {144},
  editor = {Bender, Michael A. and Svensson, Ola and Herman, Grzegorz},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2019.38},
  URN = {urn:nbn:de:0030-drops-111599},
  doi = {10.4230/LIPIcs.ESA.2019.38},
  annote = {Keywords: Random Hypergraphs, Peeling Threshold, 2-Core, Hashing, Retrieval, Succinct Data Structure}
}

**Published in:** LIPIcs, Volume 144, 27th Annual European Symposium on Algorithms (ESA 2019)

In this paper we identify a new class of sparse near-quadratic random Boolean matrices that have full row rank over F_2 = {0,1} with high probability and can be transformed into echelon form in almost linear time by a simple version of Gauss elimination. The random matrix with dimensions n(1-ε) × n is generated as follows: In each row, identify a block of length L = O((log n)/ε) at a random position. The entries outside the block are 0, the entries inside the block are given by fair coin tosses. Sorting the rows according to the positions of the blocks transforms the matrix into a kind of band matrix, on which, as it turns out, Gauss elimination works very efficiently with high probability. For the proof, the effects of Gauss elimination are interpreted as a ("coin-flipping") variant of Robin Hood hashing, whose behaviour can be captured in terms of a simple Markov model from queuing theory. Bounds for expected construction time and high success probability follow from results in this area. They readily extend to larger finite fields in place of F_2.
By employing hashing, this matrix family leads to a new implementation of a retrieval data structure, which represents an arbitrary function f: S → {0,1} for some set S of m = (1-ε)n keys. It requires m/(1-ε) bits of space, construction takes O(m/ε^2) expected time on a word RAM, while queries take O(1/ε) time and access only one contiguous segment of O((log m)/ε) bits in the representation (O(1/ε) consecutive words on a word RAM). The method is readily implemented and highly practical, and it is competitive with state-of-the-art methods. In a more theoretical variant, which works only for unrealistically large S, we can even achieve construction time O(m/ε) and query time O(1), accessing O(1) contiguous memory words for a query. By well-established methods the retrieval data structure leads to efficient constructions of (static) perfect hash functions and (static) Bloom filters with almost optimal space and very local storage access patterns for queries.
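
The retrieval idea in this abstract (one random block per key, a linear system over F_2, queries as a dot product over one contiguous segment) can be illustrated with a small Python sketch. It is not the authors' implementation: it eliminates rows incrementally instead of sorting them first, it fixes the block length at L ≤ 64 bits, and the BLAKE2b-based helper `_row` as well as the names `build` and `query` are assumptions made for illustration.

```python
import hashlib

def _row(key: str, n: int, L: int):
    """Hypothetical hashing of a key to (block start, L coefficient bits)."""
    d = hashlib.blake2b(key.encode(), digest_size=16).digest()
    start = int.from_bytes(d[:8], "little") % (n - L + 1)
    coeffs = int.from_bytes(d[8:], "little") & ((1 << L) - 1)
    return start, coeffs

def build(f: dict, n: int, L: int = 64) -> int:
    """Solve for an n-bit vector z (as an int) with query(z, x, n) == f[x]."""
    pivots = [None] * n   # pivots[i] = (coefficients with lowest bit at column i, rhs)
    for key, bit in f.items():
        pos, c = _row(key, n, L)
        while c:
            shift = (c & -c).bit_length() - 1      # index of the lowest set bit
            pos, c = pos + shift, c >> shift
            if pivots[pos] is None:
                pivots[pos] = (c, bit)             # row becomes the pivot for column pos
                break
            pc, pb = pivots[pos]
            c, bit = c ^ pc, bit ^ pb              # eliminate column pos
        else:
            if bit:
                raise ValueError("linear system is inconsistent; rehash or enlarge n")
    z = 0
    for i in reversed(range(n)):                   # back substitution
        if pivots[i] is not None:
            c, bit = pivots[i]
            if (bin((z >> i) & c).count("1") + bit) % 2:
                z |= 1 << i
    return z

def query(z: int, key: str, n: int, L: int = 64) -> int:
    """Dot product over F_2 of the key's block with one segment of z."""
    start, c = _row(key, n, L)
    return bin((z >> start) & c).count("1") % 2
```

With `f` mapping m string keys to bits and n somewhat larger than m, `z = build(f, n)` yields `query(z, x, n) == f[x]` for every key `x` in `f`; for keys outside `f` the result is an arbitrary bit. A query reads only the L bits of `z` starting at the key's block position, which is the locality property the abstract highlights.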

Martin Dietzfelbinger and Stefan Walzer. Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications. In 27th Annual European Symposium on Algorithms (ESA 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 144, pp. 39:1-39:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

@InProceedings{dietzfelbinger_et_al:LIPIcs.ESA.2019.39,
  author = {Dietzfelbinger, Martin and Walzer, Stefan},
  title = {{Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications}},
  booktitle = {27th Annual European Symposium on Algorithms (ESA 2019)},
  pages = {39:1--39:18},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-124-5},
  ISSN = {1868-8969},
  year = {2019},
  volume = {144},
  editor = {Bender, Michael A. and Svensson, Ola and Herman, Grzegorz},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2019.39},
  URN = {urn:nbn:de:0030-drops-111602},
  doi = {10.4230/LIPIcs.ESA.2019.39},
  annote = {Keywords: Random Band Matrix, Gauss Elimination, Retrieval, Hashing, Succinct Data Structure, Randomised Data Structure, Robin Hood Hashing, Bloom Filter}
}

**Published in:** LIPIcs, Volume 126, 36th International Symposium on Theoretical Aspects of Computer Science (STACS 2019)

For a set U (the universe), retrieval is the following problem. Given a finite subset S ⊆ U of size m and f : S → {0,1}^r for a small constant r, build a data structure D_f with the property that for a suitable query algorithm query we have query(D_f,x) = f(x) for all x ∈ S. For x ∈ U ∖ S the value query(D_f,x) is arbitrary in {0,1}^r. The number of bits needed for D_f should be (1+ε)rm with overhead ε = ε(m) ≥ 0 as small as possible, while the query time should be small. Of course, the time for constructing D_f is relevant as well.
We assume fully random hash functions on U with constant evaluation time are available. It is known that with ε ≈ 0.09 one can achieve linear construction time and constant query time, and with overhead ε_k ≈ e^{-k} it is possible to have O(k) query time and O(m^{1+α}) construction time, for arbitrary α > 0. Furthermore, a theoretical construction with ε = O((log log m)/√(log m)) gives constant query time and linear construction time. Known constructions avoiding all overhead, except for a seed value of size O(log log m), require logarithmic query time.
In this paper, we present a method for treating the retrieval problem with overhead ε = O((log m)/m), which corresponds to O(1) extra memory words (O(log m) bits), and an extremely simple, constant-time query operation. The price to pay is a construction time of O(m^2). We employ the usual framework for retrieval data structures, where construction is effected by solving a sparse linear system of equations over the 2-element field F_2 and a query is effected by a dot product calculation. Our main technical contribution is the design and analysis of a new and natural family of sparse random linear systems with m equations and (1+ε)m variables, which combines good locality properties with high probability of having full rank.
Paying a larger overhead of ε = O((log m)/m^α), the construction time can be reduced to O(m^{1+α}) for arbitrary constant 0 < α < 1. In combination with an adaptation of known techniques for solving sparse linear systems of equations, our approach leads to a highly practical algorithm for retrieval. In a particular benchmark with m = 10^7 we achieve an order-of-magnitude improvement over previous techniques with ε = 0.24% instead of the previously best result of ε ≈ 3%, with better query time and no significant sacrifices in construction time.

Martin Dietzfelbinger and Stefan Walzer. Constant-Time Retrieval with O(log m) Extra Bits. In 36th International Symposium on Theoretical Aspects of Computer Science (STACS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 126, pp. 24:1-24:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

@InProceedings{dietzfelbinger_et_al:LIPIcs.STACS.2019.24,
  author = {Dietzfelbinger, Martin and Walzer, Stefan},
  title = {{Constant-Time Retrieval with O(log m) Extra Bits}},
  booktitle = {36th International Symposium on Theoretical Aspects of Computer Science (STACS 2019)},
  pages = {24:1--24:16},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-100-9},
  ISSN = {1868-8969},
  year = {2019},
  volume = {126},
  editor = {Niedermeier, Rolf and Paul, Christophe},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2019.24},
  URN = {urn:nbn:de:0030-drops-102639},
  doi = {10.4230/LIPIcs.STACS.2019.24},
  annote = {Keywords: Retrieval, Hashing, Succinct Data Structure, Randomised Data Structure, Structured Gaussian Elimination, Method of Four Russians}
}

**Published in:** LIPIcs, Volume 117, 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018)

Given a set X of n binary words of equal length w, the 3XOR problem asks for three elements a, b, c ∈ X such that a ⊕ b = c, where ⊕ denotes the bitwise XOR operation. The problem can be easily solved on a word RAM with word length w in time O(n^2 log n). Using Han's fast integer sorting algorithm (STOC/J. Algorithms, 2002/2004) this can be reduced to O(n^2 log log n). With randomization or a sophisticated deterministic dictionary construction, creating a hash table for X with constant lookup time leads to an algorithm with (expected) running time O(n^2). At present, seemingly no faster algorithms are known.
We present a surprisingly simple deterministic, quadratic time algorithm for 3XOR. Its core is a version of the PATRICIA tree for X, which makes it possible to traverse the set a ⊕ X in ascending order for arbitrary a ∈ {0,1}^w in linear time. Furthermore, we describe a randomized algorithm for 3XOR with expected running time O(n^2 · min{log^3(w)/w, (log log n)^2/log^2 n}). The algorithm transfers techniques to our setting that were used by Baran, Demaine, and Patrascu (WADS/Algorithmica, 2005/2008) for solving the related int3SUM problem (the same problem with integer addition in place of binary XOR) in expected time o(n^2). As suggested by Jafargholi and Viola (Algorithmica, 2016), linear hash functions are employed.
The latter authors also showed that assuming 3XOR needs expected running time n^{2-o(1)} one can prove conditional lower bounds for triangle enumeration just as with 3SUM. We demonstrate that 3XOR can be reduced to other problems as well, treating the examples offline SetDisjointness and offline SetIntersection, which were studied for 3SUM by Kopelowitz, Pettie, and Porat (SODA, 2016).
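
The hash-table baseline with expected running time O(n^2) mentioned in the first paragraph of this abstract can be sketched as follows. This is the baseline only, not the paper's PATRICIA-tree or subquadratic algorithm. As an assumption of the sketch, words are represented as Python integers and the three returned elements are required to be pairwise distinct, which rules out trivial solutions involving the all-zero word.

```python
from itertools import combinations

def three_xor(words):
    """Return (a, b, c) from words with a ^ b == c, or None if none exists."""
    present = set(words)                  # hash table with expected O(1) lookups
    for a, b in combinations(present, 2):
        c = a ^ b
        if c in present and c != a and c != b:
            return a, b, c
    return None
```

For instance, `three_xor([1, 2, 3])` finds a solution because 1 ⊕ 2 = 3, while `three_xor([1, 2, 4])` returns `None`.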

Martin Dietzfelbinger, Philipp Schlag, and Stefan Walzer. A Subquadratic Algorithm for 3XOR. In 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 117, pp. 59:1-59:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{dietzfelbinger_et_al:LIPIcs.MFCS.2018.59,
  author = {Dietzfelbinger, Martin and Schlag, Philipp and Walzer, Stefan},
  title = {{A Subquadratic Algorithm for 3XOR}},
  booktitle = {43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018)},
  pages = {59:1--59:15},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-086-6},
  ISSN = {1868-8969},
  year = {2018},
  volume = {117},
  editor = {Potapov, Igor and Spirakis, Paul and Worrell, James},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2018.59},
  URN = {urn:nbn:de:0030-drops-96417},
  doi = {10.4230/LIPIcs.MFCS.2018.59},
  annote = {Keywords: 3SUM, 3XOR, Randomized Algorithms, Reductions, Conditional Lower Time Bounds}
}

**Published in:** LIPIcs, Volume 107, 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)

Stefan Walzer. Load Thresholds for Cuckoo Hashing with Overlapping Blocks. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 102:1-102:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{walzer:LIPIcs.ICALP.2018.102,
  author = {Walzer, Stefan},
  title = {{Load Thresholds for Cuckoo Hashing with Overlapping Blocks}},
  booktitle = {45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)},
  pages = {102:1--102:10},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-076-7},
  ISSN = {1868-8969},
  year = {2018},
  volume = {107},
  editor = {Chatzigiannakis, Ioannis and Kaklamanis, Christos and Marx, D\'{a}niel and Sannella, Donald},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2018.102},
  URN = {urn:nbn:de:0030-drops-91068},
  doi = {10.4230/LIPIcs.ICALP.2018.102},
  annote = {Keywords: Cuckoo Hashing, Unaligned Blocks, Hypergraph Orientability, Load Thresholds, Randomised Algorithms}
}

**Published in:** LIPIcs, Volume 101, 16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2018)

In k-ary cuckoo hashing, each of cn objects is associated with k random buckets in a hash table of size n. An l-orientation is an assignment of objects to associated buckets such that each bucket receives at most l objects. Several works have determined load thresholds c^* = c^*(k,l) for k-ary cuckoo hashing; that is, for c < c^* an l-orientation exists with high probability, and for c > c^* no l-orientation exists with high probability.
A natural variant of k-ary cuckoo hashing utilizes double hashing, where, when the buckets are numbered 0,1,...,n-1, the k choices of random buckets form an arithmetic progression modulo n. Double hashing simplifies implementation and requires less randomness, and it has been shown that double hashing has the same behavior as fully random hashing in several other data structures that similarly use multiple hashes for each object. Interestingly, previous work has come close to but has not fully shown that the load threshold for k-ary cuckoo hashing is the same when using double hashing as when using fully random hashing. Specifically, previous work has shown that the thresholds for both settings coincide, except that for double hashing it was possible that o(n) objects would have been left unplaced. Here we close this open question by showing the thresholds are indeed the same, by providing a combinatorial argument that reconciles this stubborn difference.
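
The way double hashing derives the k bucket choices as an arithmetic progression modulo n can be sketched as follows. The BLAKE2b-based derivation of the start and step values and the name `double_hash_choices` are assumptions made for illustration. The step is drawn from 1, …, n-1, which guarantees k distinct buckets when n is prime and k ≤ n; other table sizes may require additional care.

```python
import hashlib

def double_hash_choices(key: str, n: int, k: int):
    """Return k bucket indices start, start+step, ..., taken modulo n."""
    digest = hashlib.blake2b(key.encode(), digest_size=16).digest()
    start = int.from_bytes(digest[:8], "little") % n
    step = 1 + int.from_bytes(digest[8:], "little") % (n - 1)   # never 0
    return [(start + i * step) % n for i in range(k)]
```

Only two hash values per key are needed regardless of k, which is the reduction in randomness the abstract refers to.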

Michael Mitzenmacher, Konstantinos Panagiotou, and Stefan Walzer. Load Thresholds for Cuckoo Hashing with Double Hashing. In 16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 101, pp. 29:1-29:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{mitzenmacher_et_al:LIPIcs.SWAT.2018.29,
  author = {Mitzenmacher, Michael and Panagiotou, Konstantinos and Walzer, Stefan},
  title = {{Load Thresholds for Cuckoo Hashing with Double Hashing}},
  booktitle = {16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2018)},
  pages = {29:1--29:9},
  series = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN = {978-3-95977-068-2},
  ISSN = {1868-8969},
  year = {2018},
  volume = {101},
  editor = {Eppstein, David},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address = {Dagstuhl, Germany},
  URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SWAT.2018.29},
  URN = {urn:nbn:de:0030-drops-88557},
  doi = {10.4230/LIPIcs.SWAT.2018.29},
  annote = {Keywords: Cuckoo Hashing, Double Hashing, Load Thresholds, Hypergraph Orientability}
}
