Flexible Paxos: Quorum Intersection Revisited

Authors Heidi Howard, Dahlia Malkhi, Alexander Spiegelman



PDF
Thumbnail PDF

File

LIPIcs.OPODIS.2016.25.pdf
  • Filesize: 492 kB
  • 14 pages

Document Identifiers

Author Details

Heidi Howard
Dahlia Malkhi
Alexander Spiegelman

Cite As Get BibTex

Heidi Howard, Dahlia Malkhi, and Alexander Spiegelman. Flexible Paxos: Quorum Intersection Revisited. In 20th International Conference on Principles of Distributed Systems (OPODIS 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 70, pp. 25:1-25:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017) https://doi.org/10.4230/LIPIcs.OPODIS.2016.25

Abstract

Distributed consensus is integral to modern distributed systems. The widely adopted Paxos algorithm uses two phases, each requiring majority agreement, to reliably reach consensus. In this paper, we demonstrate that Paxos, which lies at the foundation of many production systems, is conservative. Specifically, we observe that each of the phases of Paxos may use non-intersecting quorums. Majority quorums are not necessary as intersection is required only across phases.

Using this weakening of the requirements made in the original formulation, we propose Flexible Paxos, which generalizes over the Paxos algorithm to provide flexible quorums. We show that Flexible Paxos is safe, e cient and easy to utilize in existing distributed systems. We discuss far reaching implications of this result. For example, improved availability results from reducing the size of second phase quorums by one when the system size is even, while keeping majority quorums in the first phase. Another example is improved throughput of replication by using much smaller phase 2 quorums, while increasing the leader election (phase 1) quorums. Finally, non intersecting quorums in either first or second phases may enhance the efficiency of both.

Subject Classification

Keywords
  • Paxos
  • Distributed Consensus
  • Quorums

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Marcos K. Aguilera, Idit Keidar, Dahlia Malkhi, and Alexander Shraer. Dynamic atomic storage without consensus. J. ACM, 58(2):7:1-7:32, April 2011. URL: http://dx.doi.org/10.1145/1944345.1944348.
  2. Mahesh Balakrishnan, Dahlia Malkhi, John D. Davis, Vijayan Prabhakaran, Michael Wei, and Ted Wobber. Corfu: A distributed shared log. ACM Trans. Comput. Syst., 31(4):10:1-10:24, December 2013. Google Scholar
  3. Ken Birman and Thomas Joseph. Exploiting virtual synchrony in distributed systems, volume 21. ACM, 1987. Google Scholar
  4. Mike Burrows. The chubby lock service for loosely-coupled distributed systems. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation, OSDI'06, pages 335-350, Berkeley, CA, USA, 2006. USENIX Association. URL: http://dl.acm.org/citation.cfm?id=1298455.1298487.
  5. Miguel Castro and Barbara Liskov. Practical byzantine fault tolerance. In Proceedings of the Third Symposium on Operating Systems Design and Implementation, OSDI'99, pages 173-186, Berkeley, CA, USA, 1999. USENIX Association. URL: http://dl.acm.org/citation.cfm?id=296806.296824.
  6. Cynthia Dwork, Nancy Lynch, and Larry Stockmeyer. Consensus in the presence of partial synchrony. Journal of the ACM (JACM), 35(2):288-323, 1988. Google Scholar
  7. Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM (JACM), 32(2):374-382, 1985. Google Scholar
  8. Eli Gafni and Dahlia Malkhi. Elastic configuration maintenance via a parsimonious speculating snapshot solution. In Proceedings of the 29th International Symposium on Distributed Computing, pages 140-153. Springer, 2015. Google Scholar
  9. David K. Gifford. Weighted voting for replicated data. In Proceedings of the Seventh ACM Symposium on Operating Systems Principles, SOSP'79, pages 150-162, New York, NY, USA, 1979. ACM. URL: http://dx.doi.org/10.1145/800215.806583.
  10. Seth Gilbert, Nancy A. Lynch, and Alexander A. Shvartsman. RAMBO: A robust, reconfigurable atomic memory service for dynamic networks. Distributed Computing, 23(4):225-272, 2010. Google Scholar
  11. Heidi Howard, Dahlia Malkhi, and Alexander Spiegelman. Flexable paxos: tlaplus specification. URL: https://github.com/fpaxos/fpaxos-tlaplus.
  12. Patrick Hunt, Mahadev Konar, Flavio P. Junqueira, and Benjamin Reed. Zookeeper: Wait-free coordination for internet-scale systems. In Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference, USENIXATC'10, pages 11-11, Berkeley, CA, USA, 2010. USENIX Association. URL: http://dl.acm.org/citation.cfm?id=1855840.1855851.
  13. Leander Jehl, Roman Vitenberg, and Hein Meling. Smartmerge: A new approach to reconfiguration for atomic storage. In International Symposium on Distributed Computing, pages 154-169. Springer, 2015. Google Scholar
  14. Flavio P. Junqueira, Benjamin C. Reed, and Marco Serafini. Zab: High-performance broadcast for primary-backup systems. In Dependable Systems &Networks (DSN), 2011 IEEE/IFIP 41st International Conference on, pages 245-256. IEEE, 2011. Google Scholar
  15. Manos Kapritsos, Yang Wang, Vivien Quema, Allen Clement, Lorenzo Alvisi, and Mike Dahlin. All about eve: execute-verify replication for multi-core servers. In Presented as part of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12), pages 237-250, 2012. Google Scholar
  16. Akhil Kumar. Hierarchical quorum consensus: A new algorithm for managing replicated data. IEEE Trans. Comput., 40(9):996-1004, September 1991. URL: http://dx.doi.org/10.1109/12.83661.
  17. Leslie Lamport. The implementation of reliable distributed multiprocess systems. Computer Networks (1976), 2(2):95-114, 1978. URL: http://dx.doi.org/10.1016/0376-5075(78)90045-4.
  18. Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558-565, 1978. Google Scholar
  19. Leslie Lamport. The part-time parliament. ACM Trans. Comput. Syst., 16(2):133-169, May 1998. URL: http://dx.doi.org/10.1145/279227.279229.
  20. Leslie Lamport. Paxos made simple. ACM SIGACT News (Distributed Computing Column), 2001. Google Scholar
  21. Leslie Lamport. Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2002. Google Scholar
  22. Leslie Lamport. Fast paxos. Technical Report MSR-TR-2005-112, Microsoft Research, 2005. Google Scholar
  23. Leslie Lamport. Generalized consensus and paxos. Technical Report MSR-TR-2005-33, Microsoft Research, March 2005. Google Scholar
  24. Leslie Lamport, Dahlia Malkhi, and Lidong Zhou. Vertical paxos and primary-backup replication. In Proceedings of the 28th ACM Symposium on Principles of Distributed Computing, PODC'09, pages 312-313, New York, NY, USA, 2009. ACM. URL: http://dx.doi.org/10.1145/1582716.1582783.
  25. Leslie Lamport, Dahlia Malkhi, and Lidong Zhou. Reconfiguring a state machine. ACM SIGACT News, 41(1):63-73, 2010. Google Scholar
  26. Leslie Lamport and Mike Massa. Cheap paxos. In Proceedings of the 2004 International Conference on Dependable Systems and Networks, DSN'04, pages 307-, Washington, DC, USA, 2004. IEEE Computer Society. URL: http://dl.acm.org/citation.cfm?id=1009382.1009745.
  27. Yanhua Mao, Flavio P. Junqueira, and Keith Marzullo. Mencius: Building efficient replicated state machines for wans. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, OSDI'08, pages 369-384, Berkeley, CA, USA, 2008. USENIX Association. URL: http://dl.acm.org/citation.cfm?id=1855741.1855767.
  28. Parisa Jalili Marandi, Marco Primi, and Fernando Pedone. Multi-ring paxos. In IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012), pages 1-12. IEEE, 2012. Google Scholar
  29. Parisa Jalili Marandi, Marco Primi, Nicolas Schiper, and Fernando Pedone. Ring paxos: A high-throughput atomic broadcast protocol. In 2010 IEEE/IFIP International Conference on Dependable Systems &Networks (DSN), pages 527-536. IEEE, 2010. Google Scholar
  30. P. J. Marandi, M. Primi, N. Schiper, and F. Pedone. Ring paxos: A high-throughput atomic broadcast protocol. In Dependable Systems and Networks (DSN), 2010 IEEE/IFIP International Conference on, pages 527-536, June 2010. URL: http://dx.doi.org/10.1109/DSN.2010.5544272.
  31. Iulian Moraru, David G. Andersen, and Michael Kaminsky. There is more consensus in egalitarian parliaments. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, SOSP'13, pages 358-372, New York, NY, USA, 2013. ACM. URL: http://dx.doi.org/10.1145/2517349.2517350.
  32. Moni Naor and Avishai Wool. The load, capacity, and availability of quorum systems. SIAM J. Comput., 27(2):423-447, April 1998. URL: http://dx.doi.org/10.1137/S0097539795281232.
  33. Brian M. Oki and Barbara H. Liskov. Viewstamped replication: A new primary copy method to support highly-available distributed systems. In Proceedings of the seventh annual ACM Symposium on Principles of distributed computing, pages 8-17. ACM, 1988. Google Scholar
  34. Diego Ongaro and John Ousterhout. In search of an understandable consensus algorithm. In Proc. USENIX Annual Technical Conference, pages 305-320, 2014. Google Scholar
  35. David Peleg and Avishai Wool. Crumbling walls: A class of practical and efficient quorum systems. In Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing, PODC'95, pages 120-129, New York, NY, USA, 1995. ACM. URL: http://dx.doi.org/10.1145/224964.224978.
  36. Fred B. Schneider. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Comput. Surv., 22(4):299-319, December 1990. URL: http://dx.doi.org/10.1145/98163.98167.
  37. Fred B. Schneider. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Computing Surveys (CSUR), 22(4):299-319, 1990. Google Scholar
  38. Sugu Sougoumarane. A more flexible paxos. [Online; accessed 13-Aug-2016]. URL: http://ssougou.blogspot.com/2016/08/a-more-flexible-paxos.html.
  39. Alexander Spiegelman, Idit Keidar, and Dahlia Malkhi. Dynamic reconfiguration: A tutorial. OPODIS 2015, 2015. Google Scholar
  40. P. Sutra and M. Shapiro. Fast genuine generalized consensus. In Reliable Distributed Systems (SRDS), 2011 30th IEEE Symposium on, pages 255-264, Oct 2011. URL: http://dx.doi.org/10.1109/SRDS.2011.38.
  41. Robbert Van Renesse and Deniz Altinbuken. Paxos made moderately complex. ACM Comput. Surv., 47(3):42:1-42:36, February 2015. URL: http://dx.doi.org/10.1145/2673577.
  42. Robbert Van Renesse and Deniz Altinbuken. Paxos made moderately complex. ACM Comput. Surv., 47(3):42:1-42:36, February 2015. URL: http://dx.doi.org/10.1145/2673577.
  43. Robbert Van Renesse and Fred B. Schneider. Chain replication for supporting high throughput and availability. In OSDI, volume 4, pages 91-104, 2004. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail