Power of d Choices with Simple Tabulation

Aamand, Anders; Bæk Tejs Knudsen, Mathias; Thorup, Mikkel

doi:10.4230/LIPIcs.ICALP.2018.5

File

LIPIcs.ICALP.2018.5.pdf

Filesize: 0.88 MB
14 pages

Document Identifiers

DOI: 10.4230/LIPIcs.ICALP.2018.5
URN: urn:nbn:de:0030-drops-90096

Author Details

Anders Aamand

BARC, University of Copenhagen, Universitetsparken 1, Copenhagen, Denmark.

Mathias Bæk Tejs Knudsen

University of Copenhagen and Supwiz, Copenhagen, Denmark.

Mikkel Thorup

BARC, University of Copenhagen, Universitetsparken 1, Copenhagen, Denmark.

Cite As

Anders Aamand, Mathias Bæk Tejs Knudsen, and Mikkel Thorup. Power of d Choices with Simple Tabulation. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 5:1-5:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.ICALP.2018.5

Abstract

We consider the classic d-choice paradigm of Azar et al. [STOC'94] in which m balls are put into n bins sequentially as follows: For each ball we are given a choice of d bins chosen according to d hash functions and the ball is placed in the least loaded of these bins, breaking ties arbitrarily. The interest is in the number of balls in the fullest bin after all balls have been placed. In this paper we suppose that the d hash functions are simple tabulation hash functions which are easy to implement and can be evaluated in constant time. Generalising a result by Dahlgaard et al. [SODA'16] we show that for an arbitrary constant d >= 2 the expected maximum load is at most (lg lg n)/(lg d) + O(1). We further show that by using a simple tie-breaking algorithm introduced by Vöcking [J.ACM'03] the expected maximum load is reduced to (lg lg n)/(d lg phi_d) + O(1) where phi_d is the rate of growth of the d-ary Fibonacci numbers. Both of these expected bounds match those known from the fully random setting. The analysis by Dahlgaard et al. relies on a proof by Patrascu and Thorup [J.ACM'11] concerning the use of simple tabulation for cuckoo hashing. We require a generalisation to d>2 hash functions, but the original proof is an 8-page tour de force of ad-hoc arguments that do not appear to generalise. Our main technical contribution is a shorter, simpler and more accessible proof of the result by Patrascu and Thorup, where the relevant parts generalise nicely to the analysis of d choices.
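
The d-choice process described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 32-bit key width, the 8-bit characters, and the names `SimpleTabulation`, `greedy_d_choice` and `always_go_left` are all assumptions made for illustration. Simple tabulation splits a key into characters, looks each character up in its own table of random values, and XORs the results. The greedy variant places each ball in the least loaded of its d hashed bins. The second variant follows the idea behind Vöcking's tie-breaking: the bins are split into d groups, the i-th hash function picks a bin in the i-th group, and ties go to the leftmost group. Because the hash range here is a power of two, the sketch uses d groups of `group_size` bins each rather than splitting a fixed n.

```python
import random


class SimpleTabulation:
    """Simple tabulation hash from 32-bit keys to out_bits-bit values (illustrative)."""

    def __init__(self, out_bits, rng, chars=4, char_bits=8):
        self.chars = chars
        self.char_bits = char_bits
        self.mask = (1 << char_bits) - 1
        # One independent table of random out_bits-bit values per character position.
        self.tables = [
            [rng.getrandbits(out_bits) for _ in range(1 << char_bits)]
            for _ in range(chars)
        ]

    def __call__(self, key):
        h = 0
        for i in range(self.chars):
            # Extract the i-th character of the key and XOR in its table entry.
            h ^= self.tables[i][(key >> (i * self.char_bits)) & self.mask]
        return h


def greedy_d_choice(keys, hashes, n_bins):
    """Place each key in the least loaded of its d hashed bins.

    Ties are broken in favour of the earliest hash function, which is one
    admissible instance of "breaking ties arbitrarily".
    """
    load = [0] * n_bins
    for key in keys:
        choices = [h(key) for h in hashes]
        target = min(choices, key=lambda b: load[b])
        load[target] += 1
    return load


def always_go_left(keys, hashes, group_size):
    """Go-left variant: hash i selects a bin in group i, ties go to the leftmost group."""
    d = len(hashes)
    load = [0] * (d * group_size)
    for key in keys:
        choices = [i * group_size + h(key) for i, h in enumerate(hashes)]
        # min() returns the first minimum, i.e. the bin in the leftmost group.
        target = min(choices, key=lambda b: load[b])
        load[target] += 1
    return load


if __name__ == "__main__":
    rng = random.Random(2018)           # fixed seed, chosen arbitrarily
    bits, d = 10, 3
    n = 1 << bits
    keys = rng.sample(range(1 << 32), n)  # n distinct 32-bit keys
    hashes = [SimpleTabulation(bits, rng) for _ in range(d)]
    print("greedy max load:", max(greedy_d_choice(keys, hashes, n)))
    print("go-left max load:", max(always_go_left(keys, hashes, n)))
```

The two runs in the demo are not directly comparable, since the go-left call uses d times as many bins. The sketch only shows the mechanics of the two placement rules, not the bounds proved in the paper.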

Subject Classification

ACM Subject Classification

Theory of computation → Pseudorandomness and derandomization
Mathematics of computing → Random graphs
Mathematics of computing → Probabilistic algorithms
Theory of computation → Online algorithms
Theory of computation → Data structures design and analysis
Theory of computation → Bloom filters and hashing

Keywords

Hashing
Load Balancing
Balls and Bins
Simple Tabulation

References

Anders Aamand, Mathias Bæk Tejs Knudsen, and Mikkel Thorup. Power of d choices with simple tabulation. CoRR, abs/1804.09684, 2018. URL: https://arxiv.org/abs/1804.09684.
Yossi Azar, Andrei Z. Broder, Anna R. Karlin, and Eli Upfal. Balanced allocations. SIAM Journal on Computing, 29(1):180-200, 1999. See also STOC'94.
Larry Carter and Mark N. Wegman. Universal classes of hash functions. Journal of Computer and System Sciences, 18(2):143-154, 1979. See also STOC'77.
L. Elisa Celis, Omer Reingold, Gil Segev, and Udi Wieder. Balls and bins: Smaller hash families and faster evaluation. In IEEE 52nd Symposium on Foundations of Computer Science, FOCS, pages 599-608, 2011.
Xue Chen. Derandomized balanced allocation. CoRR, abs/1702.03375, 2017. Preprint. URL: http://arxiv.org/abs/1702.03375.
Søren Dahlgaard, Mathias Bæk Tejs Knudsen, Eva Rotenberg, and Mikkel Thorup. Hashing for statistics over k-partitions. In Proc. 56th Symposium on Foundations of Computer Science, FOCS, pages 1292-1310, 2015.
Søren Dahlgaard, Mathias Bæk Tejs Knudsen, Eva Rotenberg, and Mikkel Thorup. The power of two choices with simple tabulation. In Proc. 27th ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 1631-1642, 2016.
Martin Dietzfelbinger, Torben Hagerup, Jyrki Katajainen, and Martti Penttonen. A reliable randomized algorithm for the closest-pair problem. Journal of Algorithms, 25(1):19-51, 1997. URL: http://dx.doi.org/10.1006/jagm.1997.0873.
Martin Dietzfelbinger and Philipp Woelfel. Almost random graphs with simple hash functions. In Proc. 35th ACM Symposium on Theory of Computing, STOC, pages 629-638, 2003. URL: http://dx.doi.org/10.1145/780542.780634.
Gaston H. Gonnet. Expected length of the longest probe sequence in hash code searching. Journal of the ACM, 28(2):289-304, 1981.
Michael Mitzenmacher. The power of two choices in randomized load balancing. IEEE Transactions on Parallel and Distribed Systems, 12(10):1094-1104, 2001.
Michael Mitzenmacher, Andrea W. Richa, and Ramesh Sitaraman. The power of two random choices: A survey of techniques and results. Handbook of Randomized Computing, 1:255-312, 2001.
Michael Mitzenmacher and Eli Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, New York, NY, USA, 2005.
Rasmus Pagh and Flemming F. Rodler. Cuckoo hashing. Journal of Algorithms, 51(2):122-144, 2004. See also ESA'01.
Mihai Pǎtraşcu and Mikkel Thorup. The power of simple tabulation hashing. Journal of the ACM, 59(3):14:1-14:50, 2012. Announced at STOC'11.
Omer Reingold, Ron D. Rothblum, and Udi Wieder. Pseudorandom graphs in data structures. In Proc. 41st International Colloquium on Automata, Languages and Programming, ICALP, pages 943-954, 2014. URL: http://dx.doi.org/10.1007/978-3-662-43948-7_78.
Alan Siegel. On universal classes of extremely random constant-time hash functions. SIAM Journal on Computing, 33(3):505-543, 2004. See also FOCS'89.
Mikkel Thorup. Simple tabulation, fast expanders, double tabulation, and high independence. In Proc. 54th Symposium on Foundations of Computer Science, FOCS, pages 90-99, 2013.
Mikkel Thorup and Yin Zhang. Tabulation-based 5-independent hashing with applications to linear probing and second moment estimation. SIAM Journal on Computing, 41(2):293-331, 2012. Announced at SODA'04 and ALENEX'10.
Berthold Vöcking. How asymmetry helps load balancing. Journal of the ACM, 50(4):568-589, 2003. See also FOCS'99.
Udi Wieder. Hashing, load balancing and multiple choice. Foundations and Trends in Theoretical Computer Science, 12(3-4):275-379, 2017. URL: http://dx.doi.org/10.1561/0400000070.
Philipp Woelfel. Asymmetric balanced allocation with simple hash functions. In Proc. 17th ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 424-433, 2006.
Albert L. Zobrist. A new hashing method with application for game playing. Technical report, Computer Sciences Department, University of Wisconsin, Madison, Wisconsin, 1970.
