Hash & Adjust: Competitive Demand-Aware Consistent Hashing

Authors: Arash Pourdamghani, Chen Avin, Robert Sama, Maryam Shiran, Stefan Schmid




File

LIPIcs.OPODIS.2024.24.pdf
  • Filesize: 1.11 MB
  • 23 pages

Author Details

Arash Pourdamghani
  • TU Berlin, Germany
Chen Avin
  • Ben-Gurion University of the Negev, Beersheba, Israel
Robert Sama
  • University of Vienna, Austria
Maryam Shiran
  • TU Berlin, Germany
Stefan Schmid
  • TU Berlin, Germany
  • Fraunhofer SIT, Berlin, Germany

Cite As

Arash Pourdamghani, Chen Avin, Robert Sama, Maryam Shiran, and Stefan Schmid. Hash & Adjust: Competitive Demand-Aware Consistent Hashing. In 28th International Conference on Principles of Distributed Systems (OPODIS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 324, pp. 24:1-24:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024) https://doi.org/10.4230/LIPIcs.OPODIS.2024.24

Abstract

Distributed systems often serve dynamic workloads, and their resource demands evolve over time. Such temporal behavior stands in contrast to the static and demand-oblivious nature of most data structures used by these systems. In this paper, we are particularly interested in consistent hashing, a fundamental building block in many large distributed systems. Our work is motivated by the hypothesis that a more adaptive approach to consistent hashing can leverage structure in the demand, and hence improve storage utilization and reduce access time.
We initiate the study of demand-aware consistent hashing. Our main contribution is H&A, a constant-competitive online algorithm (i.e., it comes with provable performance guarantees over time). H&A is demand-aware and optimizes its internal structure to enable faster access times, while offering a high utilization of storage. We further evaluate H&A empirically.

Subject Classification

ACM Subject Classification
  • Theory of computation → Online algorithms
  • Theory of computation → Data structures design and analysis
Keywords
  • Consistent hashing
  • demand-awareness
  • online algorithms
