Power of d Choices with Simple Tabulation

Authors: Anders Aamand, Mathias Bæk Tejs Knudsen, Mikkel Thorup




File

LIPIcs.ICALP.2018.5.pdf
  • Filesize: 0.88 MB
  • 14 pages


Author Details

Anders Aamand
  • BARC, University of Copenhagen, Universitetsparken 1, Copenhagen, Denmark.
Mathias Bæk Tejs Knudsen
  • University of Copenhagen and Supwiz, Copenhagen, Denmark.
Mikkel Thorup
  • BARC, University of Copenhagen, Universitetsparken 1, Copenhagen, Denmark.

Cite As

Anders Aamand, Mathias Bæk Tejs Knudsen, and Mikkel Thorup. Power of d Choices with Simple Tabulation. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 5:1-5:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018) https://doi.org/10.4230/LIPIcs.ICALP.2018.5

Abstract

We consider the classic d-choice paradigm of Azar et al. [STOC'94] in which m balls are put into n bins sequentially as follows: For each ball we are given a choice of d bins chosen according to d hash functions and the ball is placed in the least loaded of these bins, breaking ties arbitrarily. The interest is in the number of balls in the fullest bin after all balls have been placed.
In this paper we suppose that the d hash functions are simple tabulation hash functions, which are easy to implement and can be evaluated in constant time. Generalising a result by Dahlgaard et al. [SODA'16], we show that for an arbitrary constant d >= 2 the expected maximum load is at most (lg lg n)/(lg d) + O(1). We further show that, by using a simple tie-breaking algorithm introduced by Vöcking [J.ACM'03], the expected maximum load is reduced to (lg lg n)/(d lg phi_d) + O(1), where phi_d is the rate of growth of the d-ary Fibonacci numbers. Both of these expected bounds match those known from the fully random setting.
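For reference, the two bounds stated above can be written out as follows. The recurrence and characteristic equation for phi_d are the standard definition of the d-ary Fibonacci growth rate and are not spelled out in the abstract itself; the initial values of the sequence are omitted because they do not affect the growth rate.

```latex
% d-ary Fibonacci numbers and their growth rate (standard definition, assumed here)
\[
  F_d(k) = \sum_{i=1}^{d} F_d(k-i), \qquad
  \phi_d = \lim_{k \to \infty} \sqrt[k]{F_d(k)}, \qquad
  \phi_d^{\,d} = \sum_{i=0}^{d-1} \phi_d^{\,i}.
\]
% The two expected maximum-load bounds from the abstract
\[
  \mathbb{E}[\text{max load}] \le \frac{\lg\lg n}{\lg d} + O(1)
  \quad \text{(arbitrary tie-breaking)},
  \qquad
  \mathbb{E}[\text{max load}] \le \frac{\lg\lg n}{d \,\lg \phi_d} + O(1)
  \quad \text{(V\"ocking's tie-breaking)}.
\]
% Example for d = 2: \phi_2 = (1+\sqrt{5})/2 \approx 1.618, so
% d \lg \phi_d \approx 1.39, compared with \lg d = 1 in the first bound.
```

Since phi_2 is the golden ratio, the leading term for d = 2 drops from lg lg n to roughly 0.72 lg lg n under the second bound.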
The analysis by Dahlgaard et al. relies on a proof by Pǎtraşcu and Thorup [J.ACM'11] concerning the use of simple tabulation for cuckoo hashing. We require a generalisation to d > 2 hash functions, but the original proof is an 8-page tour de force of ad-hoc arguments that do not appear to generalise. Our main technical contribution is a shorter, simpler and more accessible proof of the result by Pǎtraşcu and Thorup, where the relevant parts generalise nicely to the analysis of d choices.
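The process described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' code: the function names, the 32-bit key width, the split into four 8-bit characters, and the power-of-two number of bins are all assumptions made for illustration. Each hash function XORs together one random table entry per key character, which is the usual way simple tabulation is defined.

```python
import random


def make_simple_tabulation(num_chars, char_bits, out_bits, rng):
    """Build h(x) = T_0[x_0] XOR ... XOR T_{c-1}[x_{c-1}] with random tables.

    The key is viewed as num_chars characters of char_bits bits each
    (an assumed layout); each table maps a character to out_bits random bits.
    """
    tables = [
        [rng.getrandbits(out_bits) for _ in range(1 << char_bits)]
        for _ in range(num_chars)
    ]
    mask = (1 << char_bits) - 1

    def h(key):
        value = 0
        for i, table in enumerate(tables):
            # Look up the i-th character of the key and XOR it in.
            value ^= table[(key >> (i * char_bits)) & mask]
        return value

    return h


def d_choice_loads(keys, bins_log2, d, seed=0):
    """Place each key in the least loaded of its d hashed bins.

    Ties go to the candidate from the lowest-numbered hash function,
    which is one admissible form of "arbitrary" tie-breaking.
    """
    rng = random.Random(seed)
    hashes = [make_simple_tabulation(4, 8, bins_log2, rng) for _ in range(d)]
    loads = [0] * (1 << bins_log2)
    for key in keys:
        candidates = [h(key) for h in hashes]
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return loads


if __name__ == "__main__":
    n_log2 = 10  # n = 1024 bins, m = 1024 balls
    for d in (1, 2, 3):
        loads = d_choice_loads(range(1 << n_log2), n_log2, d)
        print(f"d={d}: maximum load {max(loads)}")
```

Note that this sketch only covers arbitrary tie-breaking. Vöcking's scheme additionally partitions the bins into d groups, draws one candidate per group, and breaks ties towards the leftmost group, which would need a different bin layout than the one assumed here.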

Subject Classification

ACM Subject Classification
  • Theory of computation → Pseudorandomness and derandomization
  • Mathematics of computing → Random graphs
  • Mathematics of computing → Probabilistic algorithms
  • Theory of computation → Online algorithms
  • Theory of computation → Data structures design and analysis
  • Theory of computation → Bloom filters and hashing
Keywords
  • Hashing
  • Load Balancing
  • Balls and Bins
  • Simple Tabulation


References

  1. Anders Aamand, Mathias Bæk Tejs Knudsen, and Mikkel Thorup. Power of d choices with simple tabulation. CoRR, abs/1804.09684, 2018. URL: https://arxiv.org/abs/1804.09684.
  2. Yossi Azar, Andrei Z. Broder, Anna R. Karlin, and Eli Upfal. Balanced allocations. SIAM Journal on Computing, 29(1):180-200, 1999. See also STOC'94.
  3. Larry Carter and Mark N. Wegman. Universal classes of hash functions. Journal of Computer and System Sciences, 18(2):143-154, 1979. See also STOC'77.
  4. L. Elisa Celis, Omer Reingold, Gil Segev, and Udi Wieder. Balls and bins: Smaller hash families and faster evaluation. In IEEE 52nd Symposium on Foundations of Computer Science, FOCS, pages 599-608, 2011.
  5. Xue Chen. Derandomized balanced allocation. CoRR, abs/1702.03375, 2017. Preprint. URL: http://arxiv.org/abs/1702.03375.
  6. Søren Dahlgaard, Mathias Bæk Tejs Knudsen, Eva Rotenberg, and Mikkel Thorup. Hashing for statistics over k-partitions. In Proc. 56th Symposium on Foundations of Computer Science, FOCS, pages 1292-1310, 2015.
  7. Søren Dahlgaard, Mathias Bæk Tejs Knudsen, Eva Rotenberg, and Mikkel Thorup. The power of two choices with simple tabulation. In Proc. 27th ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 1631-1642, 2016.
  8. Martin Dietzfelbinger, Torben Hagerup, Jyrki Katajainen, and Martti Penttonen. A reliable randomized algorithm for the closest-pair problem. Journal of Algorithms, 25(1):19-51, 1997. URL: http://dx.doi.org/10.1006/jagm.1997.0873.
  9. Martin Dietzfelbinger and Philipp Woelfel. Almost random graphs with simple hash functions. In Proc. 35th ACM Symposium on Theory of Computing, STOC, pages 629-638, 2003. URL: http://dx.doi.org/10.1145/780542.780634.
  10. Gaston H. Gonnet. Expected length of the longest probe sequence in hash code searching. Journal of the ACM, 28(2):289-304, 1981.
  11. Michael Mitzenmacher. The power of two choices in randomized load balancing. IEEE Transactions on Parallel and Distributed Systems, 12(10):1094-1104, 2001.
  12. Michael Mitzenmacher, Andrea W. Richa, and Ramesh Sitaraman. The power of two random choices: A survey of techniques and results. Handbook of Randomized Computing, 1:255-312, 2001.
  13. Michael Mitzenmacher and Eli Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, New York, NY, USA, 2005.
  14. Rasmus Pagh and Flemming F. Rodler. Cuckoo hashing. Journal of Algorithms, 51(2):122-144, 2004. See also ESA'01.
  15. Mihai Pǎtraşcu and Mikkel Thorup. The power of simple tabulation hashing. Journal of the ACM, 59(3):14:1-14:50, 2012. Announced at STOC'11.
  16. Omer Reingold, Ron D. Rothblum, and Udi Wieder. Pseudorandom graphs in data structures. In Proc. 41st International Colloquium on Automata, Languages and Programming, ICALP, pages 943-954, 2014. URL: http://dx.doi.org/10.1007/978-3-662-43948-7_78.
  17. Alan Siegel. On universal classes of extremely random constant-time hash functions. SIAM Journal on Computing, 33(3):505-543, 2004. See also FOCS'89.
  18. Mikkel Thorup. Simple tabulation, fast expanders, double tabulation, and high independence. In Proc. 54th Symposium on Foundations of Computer Science, FOCS, pages 90-99, 2013.
  19. Mikkel Thorup and Yin Zhang. Tabulation-based 5-independent hashing with applications to linear probing and second moment estimation. SIAM Journal on Computing, 41(2):293-331, 2012. Announced at SODA'04 and ALENEX'10.
  20. Berthold Vöcking. How asymmetry helps load balancing. Journal of the ACM, 50(4):568-589, 2003. See also FOCS'99.
  21. Udi Wieder. Hashing, load balancing and multiple choice. Foundations and Trends in Theoretical Computer Science, 12(3-4):275-379, 2017. URL: http://dx.doi.org/10.1561/0400000070.
  22. Philipp Woelfel. Asymmetric balanced allocation with simple hash functions. In Proc. 17th ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 424-433, 2006.
  23. Albert L. Zobrist. A new hashing method with application for game playing. Technical report, Computer Sciences Department, University of Wisconsin, Madison, Wisconsin, 1970.