Round-Hashing for Data Storage: Distributed Servers and External-Memory Tables

Authors Roberto Grossi, Luca Versari



PDF
Thumbnail PDF

File

LIPIcs.ESA.2018.43.pdf
  • Filesize: 0.58 MB
  • 14 pages

Document Identifiers

Author Details

Roberto Grossi
  • Dipartimento di Informatica, Università di Pisa, Italy
Luca Versari
  • Dipartimento di Informatica, Università di Pisa, Italy

Cite As Get BibTex

Roberto Grossi and Luca Versari. Round-Hashing for Data Storage: Distributed Servers and External-Memory Tables. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 43:1-43:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018) https://doi.org/10.4230/LIPIcs.ESA.2018.43

Abstract

This paper proposes round-hashing, which is suitable for data storage on distributed servers and for implementing external-memory tables in which each lookup retrieves at most one single block of external memory, using a stash. For data storage, round-hashing is like consistent hashing as it avoids a full rehashing of the keys when new servers are added. Experiments show that the speed to serve requests is tenfold or more than the state of the art. In distributed data storage, this guarantees better throughput for serving requests and, moreover, greatly reduces decision times for which data should move to new servers as rescanning data is much faster.

Subject Classification

ACM Subject Classification
  • Theory of computation → Sorting and searching
  • Information systems → Data dictionaries
Keywords
  • consistent hashing
  • external memory
  • hash tables

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Alok Aggarwal and Jeffrey Scott Vitter. The input/output complexity of sorting and related problems. Communications of the ACM, 31(9):1116-1127, sep 1988. Google Scholar
  2. A. Andersson, P. B. Miltersen, S. Riis, and M. Thorup. Static dictionaries on AC⁰ RAMs: query time θ(√log n/log log n) is necessary and sufficient. In Proceedings of 37th Conference on Foundations of Computer Science, pages 441-450, Oct 1996. URL: http://dx.doi.org/10.1109/SFCS.1996.548503.
  3. Noa Bar-Yosef and Avishai Wool. Remote algorithmic complexity attacks against randomized hash tables. In SECRYPT 2007, Proceedings of the International Conference on Security and Cryptography, Barcelona, Spain, July 28-13, 2007, SECRYPT is part of ICETE - The International Joint Conference on e-Business and Telecommunications, pages 117-124, 2007. Google Scholar
  4. Andrej Brodnik, Peter Bro Miltersen, and J. Ian Munro. Trans-dichotomous algorithms without multiplication - some upper and lower bounds. In Algorithms and Data Structures, 5th International Workshop, WADS '97, Halifax, Nova Scotia, Canada, August 6-8, 1997, Proceedings, pages 426-439, 1997. Google Scholar
  5. F. Cesarini and G. Soda. Single access hashing with overflow separators for dynamic files. BIT Numerical Mathematics, 33(1):15-28, Mar 1993. URL: http://dx.doi.org/10.1007/BF01990340.
  6. Alexander Conway, Martin Farach-Colton, and Philip Shilane. Optimal hashing in external memory. In Proc. Automata, Languages and Programming, 45th International Colloquium (ICALP18), 2018. Google Scholar
  7. Scott A. Crosby and Dan S. Wallach. Denial of service via algorithmic complexity attacks. In USENIX Security Symposium. USENIX Association, 2003. Google Scholar
  8. Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: amazon’s highly available key-value store. In Proceedings of the 21st ACM Symposium on Operating Systems Principles 2007, SOSP 2007, Stevenson, Washington, USA, October 14-17, 2007, pages 205-220, 2007. Google Scholar
  9. Ronald Fagin, Jurg Nievergelt, Nicholas Pippenger, and H. Raymond Strong. Extendible hashing - a fast access method for dynamic files. ACM Trans. Database Syst., 4(3):315-344, 1979. URL: http://dx.doi.org/10.1145/320083.320092.
  10. T. Granlund and P. L. Montgomery. Division by invariant integers using multiplication. ACM Sigplan Notices, 29:61-72, 1994. Google Scholar
  11. Intel Co. Intel 64 and IA-32 Architectures Optimization Reference Manual. Intel, order 248966-040, April 2018. Google Scholar
  12. Morten Skaarup Jensen and Rasmus Pagh. Optimality in external memory hashing. Algorithmica, 52(3):403-411, 2008. Google Scholar
  13. David Karger, Eric Lehman, Tom Leighton, Rina Panigrahy, Matthew Levine, and Daniel Lewin. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the world wide web. In Proceedings of the Twenty-ninth Annual ACM Symposium on Theory of Computing, STOC '97, pages 654-663, New York, NY, USA, 1997. ACM. URL: http://dx.doi.org/10.1145/258533.258660.
  14. John Lamping and Eric Veach. A fast, minimal memory, consistent hash algorithm. CoRR, abs/1406.2294, 2014. URL: http://arxiv.org/abs/1406.2294.
  15. Per-Åke Larson. Linear hashing with partial expansions. In VLDB, volume 6, pages 224-232, 1980. Google Scholar
  16. Witold Litwin. Linear hashing: a new tool for file and table addressing. In VLDB, volume 80, pages 1-3, 1980. Google Scholar
  17. Harry G Mairson. The program complexity of searching a table. In Foundations of Computer Science, 1983., 24th Annual Symposium on, pages 40-47. IEEE, 1983. Google Scholar
  18. Peter Bro Miltersen. Lower bounds for static dictionaries on RAMs with bit operations but no multiplication. In Automata, Languages and Programming, 23rd International Colloquium, ICALP96, Paderborn, Germany, 8-12 July 1996, Proceedings, pages 442-453, 1996. Google Scholar
  19. Vahab Mirrokni, Mikkel Thorup, and Morteza Zadimoghaddam. Consistent hashing with bounded loads. CoRR, 2016. URL: https://arxiv.org/pdf/1608.01350v1.
  20. Rasmus Pagh. Basic external memory data structures. Algorithms for Memory Hierarchies, pages 14-35, 2003. Google Scholar
  21. Salvatore Pontarelli, Pedro Reviriego, and Michael Mitzenmacher. EMOMA: exact match in one memory access. CoRR, abs/1709.04711, 2017. URL: http://arxiv.org/abs/1709.04711.
  22. M. V. Ramakrishna and Walid R. Tout. Dynamic external hashing with guaranteed single access retrieval, pages 187-201. Springer Berlin Heidelberg, Berlin, Heidelberg, 1989. URL: http://dx.doi.org/10.1007/3-540-51295-0_127.
  23. Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, and Hari Balakrishnan. Chord: a scalable peer-to-peer lookup protocol for Internet applications. IEEE/ACM Transactions on Networking, 11(1):17-32, 2003. Google Scholar
  24. David Thaler and Chinya V. Ravishankar. Using name-based mappings to increase hit rates. IEEE/ACM Trans. Netw, 6(1):1-14, 1998. Google Scholar
  25. Sebastiano Vigna. xoroshiro128+: an extremely fast and well-distributed pseudo random number generator. http://xoroshiro.di.unimi.it/, 2018.
  26. Wei Wang and Chinya V. Ravishankar. Hash-based virtual hierarchies for scalable location service in mobile ad-hoc networks. Mobile Networks and Applications, 14(5):625-637, 2009. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail