Document Open Access Logo

Distributed Dispatching in the Parallel Server Model

Authors Guy Goren , Shay Vargaftik , Yoram Moses

Thumbnail PDF


  • Filesize: 1.03 MB
  • 18 pages

Document Identifiers

Author Details

Guy Goren
  • Technion - Israel Institute of Technology, Haifa, Israel
Shay Vargaftik
  • VMware Research
Yoram Moses
  • Technion - Israel Institute of Technology, Haifa, Israel

Cite AsGet BibTex

Guy Goren, Shay Vargaftik, and Yoram Moses. Distributed Dispatching in the Parallel Server Model. In 34th International Symposium on Distributed Computing (DISC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 179, pp. 14:1-14:18, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)


With the rapid increase in the size and volume of cloud services and data centers, architectures with multiple job dispatchers are quickly becoming the norm. Load balancing is a key element of such systems. Nevertheless, current solutions to load balancing in such systems admit a paradoxical behavior in which more accurate information regarding server queue lengths degrades performance due to herding and detrimental incast effects. Indeed, both in theory and in practice, there is a common doubt regarding the value of information in the context of multi-dispatcher load balancing. As a result, both researchers and system designers resort to more straightforward solutions, such as the power-of-two-choices to avoid worst-case scenarios, potentially sacrificing overall resource utilization and system performance. A principal focus of our investigation concerns the value of information about queue lengths in the multi-dispatcher setting. We argue that, at its core, load balancing with multiple dispatchers is a distributed computing task. In that light, we propose a new job dispatching approach, called Tidal Water Filling, which addresses the distributed nature of the system. Specifically, by incorporating the existence of other dispatchers into the decision-making process, our protocols outperform previous solutions in many scenarios. In particular, when the dispatchers have complete and accurate information regarding the server queue lengths, our policies significantly outperform all existing solutions.

Subject Classification

ACM Subject Classification
  • Computing methodologies → Distributed algorithms
  • Networks → Network algorithms
  • Theory of computation → Online algorithms
  • Distributed load balancing
  • Join the Shortest Queue
  • Tidal Water Filling
  • Parallel Server Model


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. Micah Adler, Soumen Chakrabarti, Michael Mitzenmacher, and Lars Rasmussen. Parallel randomized load balancing. Random Structures & Algorithms, 13(2):159-188, 1998. Google Scholar
  2. Jonatha Anselmi and Francois Dufour. Power-of-d-choices with memory: Fluid limit and optimality. Mathematics of Operations Research, 2020. Google Scholar
  3. Jeffrey Dean and Luiz André Barroso. The tail at scale. Communications of the ACM, 56(2):74-80, 2013. Google Scholar
  4. Daniel E Eisenbud, Cheng Yi, Carlo Contavalli, Cody Smith, Roman Kononov, Eric Mann-Hielscher, Ardas Cilingiroglu, Bin Cheyney, Wentao Shang, and Jinnah Dylan Hosein. Maglev: A fast and reliable software network load balancer. In 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI), pages 523-535, 2016. Google Scholar
  5. Atilla Eryilmaz and Rayadurgam Srikant. Asymptotically tight steady-state queue length bounds implied by drift conditions. Queueing Systems, 72(3-4):311-359, 2012. Google Scholar
  6. Rohan Gandhi, Hongqiang Harry Liu, Y Charlie Hu, Guohan Lu, Jitendra Padhye, Lihua Yuan, and Ming Zhang. Duet: Cloud scale load balancing with hardware and software. ACM SIGCOMM Computer Communication Review, 44(4):27-38, 2014. Google Scholar
  7. Owen Garrett. NGINX and the "Power of Two Choices" Load-Balancing Algorithm. , published on November 12, 2018. URL:
  8. Guy Goren, Shay Vargaftik, and Yoram Moses. Distributed dispatching in the parallel server model, 2020. URL:
  9. Richard M Karp. Reducibility among combinatorial problems. In Complexity of computer computations, pages 85-103. Springer, 1972. Google Scholar
  10. William Karush. Minima of functions of several variables with inequalities as side conditions. Master’s Thesis, Department of Mathematics, University of Chicago, 1939. Google Scholar
  11. Robert Kleinberg, Georgios Piliouras, and Éva Tardos. Load balancing without regret in the bulletin board model. Distributed Computing, 24(1):21-29, 2011. Google Scholar
  12. Harold W Kuhn and Albert W Tucker. Nonlinear programming. In Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, pages 481-492, Berkeley, Calif., 1951. University of California Press. URL:
  13. Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. Commun. ACM, 21(7):558-565, 1978. URL:
  14. Daniel Lehmann and Michael O Rabin. On the advantages of free choice: A symmetric and fully distributed solution to the dining philosophers problem. In John White, Richard J Lipton, and Patricia C Goldberg, editors, Conference Record of the 8th Annual ACM Symposium on Principles of Programming Languages, pages 133-138. ACM Press, 1981. URL:
  15. Christoph Lenzen and Roger Wattenhofer. Tight bounds for parallel randomized load balancing. In Proceedings of the 43rd annual ACM Symposium on Theory of Computing, pages 11-20, 2011. Google Scholar
  16. Yi Lu, Qiaomin Xie, Gabriel Kliot, Alan Geller, James R Larus, and Albert Greenberg. Join-idle-queue: A novel load balancing algorithm for dynamically scalable web services. Performance Evaluation, 68(11):1056-1071, 2011. Google Scholar
  17. Malwina J Luczak, Colin McDiarmid, et al. On the maximum queue length in the supermarket model. The Annals of Probability, 34(2):493-527, 2006. Google Scholar
  18. Tyler McMullen. Load Balancing is Impossible. Scaleconf 2016. URL:
  19. Michael Mitzenmacher. How useful is old information? IEEE Transactions on Parallel and Distributed Systems, 11(1):6-20, 2000. Google Scholar
  20. Michael Mitzenmacher. The power of two choices in randomized load balancing. IEEE Transactions on Parallel and Distributed Systems, 12(10):1094-1104, 2001. Google Scholar
  21. Michael Mitzenmacher. Analyzing distributed join-idle-queue: A fluid limit approach. In 2016 54th Annual Allerton Conference on Communication, Control, and Computing, pages 312-318. IEEE, 2016. Google Scholar
  22. Michael Mitzenmacher, Balaji Prabhakar, and Devavrat Shah. Load balancing with memory. In 43rd Annual IEEE Symposium on Foundations of Computer Science., pages 799-808. IEEE, 2002. Google Scholar
  23. YoungGyoun Moon, SeungEon Lee, Muhammad Asim Jamshed, and KyoungSoo Park. AccelTCP: Accelerating Network Applications with Stateful TCP Offloading. In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI), pages 77-92, 2020. Google Scholar
  24. David Murray, Terry Koziniec, Kevin Lee, and Michael Dixon. Large MTUs and internet performance. In 13th IEEE International Conference on High Performance Switching and Routing, pages 82-87, 2012. Google Scholar
  25. Rajiv Nishtala, Paul Carpenter, Vinicius Petrucci, and Xavier Martorell. Hipster: Hybrid task manager for latency-critical cloud workloads. In IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 409-420, 2017. Google Scholar
  26. George Prekas, Marios Kogias, and Edouard Bugnion. Zygos: Achieving low tail latency for microsecond-scale networked tasks. In 26th Symposium on Operating Systems Principles (SOSP), pages 325-341, 2017. Google Scholar
  27. Michael O Rabin. N-process synchronization by 4 log₂ N-valued shared variables. In 21st Annual Symposium on Foundations of Computer Science, pages 407-410. IEEE Computer Society, 1980. URL:
  28. Eric Schurman and Jake Brutlag. The user and business impact of server delays, additional bytes, and http chunking in web search. In Velocity Web Performance and Operations Conference. O'Reilly, 2009. Google Scholar
  29. Devavrat Shah and Balaji Prabhakar. The use of memory in randomized load balancing. In Proceedings IEEE International Symposium on Information Theory, page 125, 2002. Google Scholar
  30. Mike Smith. Netflix Technology Blog. Rethinking Netflix’s Edge Load Balancing. September 2018. URL:
  31. Alexander L Stolyar. Pull-based load distribution in large-scale heterogeneous service systems. Queueing Systems, 80(4):341-361, 2015. Google Scholar
  32. Alexander L Stolyar. Pull-based load distribution among heterogeneous parallel servers: the case of multiple routers. Queueing Systems, 85(1-2):31-65, 2017. Google Scholar
  33. Willy Tarreau. HAProxy. Test Driving “Power of Two Random Choices” Load Balancing. , published on February 15, 2019. URL:
  34. Mark van der Boor, Sem Borst, and Johan van Leeuwaarden. Hyper-scalable jsq with sparse feedback. Proceedings of the ACM on Measurement and Analysis of Computing Systems, 3(1):1-37, 2019. Google Scholar
  35. Shay Vargaftik, Isaac Keslassy, and Ariel Orda. LSQ: Load Balancing in Large-Scale Heterogeneous Systems With Multiple Dispatchers. IEEE/ACM Transactions on Networking, 2020. Google Scholar
  36. Nikita Dmitrievna Vvedenskaya, Roland L'vovich Dobrushin, and Fridrikh Izrailevich Karpelevich. Queueing system with selection of the shortest of two queues: An asymptotic approach. Problemy Peredachi Informatsii, 32(1):20-34, 1996. Google Scholar
  37. Chunpu Wang, Chen Feng, and Julian Cheng. Distributed join-the-idle-queue for low latency cloud services. IEEE/ACM Transactions on Networking, 26(5):2309-2319, 2018. Google Scholar
  38. Richard R Weber. On the optimal assignment of customers to parallel servers. Journal of Applied Probability, 15(2):406-413, 1978. Google Scholar
  39. Wayne Winston. Optimality of the shortest line discipline. Journal of Applied Probability, 14(1):181-189, 1977. Google Scholar
  40. Xingyu Zhou, Ness Shroff, and Adam Wierman. Asymptotically optimal load balancing in large-scale heterogeneous systems with multiple dispatchers. arXiv preprint arXiv:2002.08908, 2020. Google Scholar
  41. Xingyu Zhou, Jian Tan, and Ness Shroff. Heavy-traffic delay optimality in pull-based load balancing systems: Necessary and sufficient conditions. ACM SIGMETRICS Performance Evaluation Review, 47(1):5-6, 2019. Google Scholar
Questions / Remarks / Feedback

Feedback for Dagstuhl Publishing

Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail