State Machine Replication Is More Expensive Than Consensus

Antoniadis, Karolos; Guerraoui, Rachid; Malkhi, Dahlia; Seredinschi, Dragos-Adrian

doi:10.4230/LIPIcs.DISC.2018.7

Abstract

Consensus and State Machine Replication (SMR) are generally considered to be equivalent problems. In certain system models, indeed, the two problems are computationally equivalent: any solution to the former problem leads to a solution to the latter, and vice versa. 
In this paper, we study the relation between consensus and SMR from a complexity perspective. We find that, surprisingly, completing an SMR command can be more expensive than solving a consensus instance. Specifically, given a synchronous system model where every instance of consensus always terminates in constant time, completing an SMR command does not necessarily terminate in constant time. This result naturally extends to partially synchronous models. Besides theoretical interest, our result also corresponds to practical phenomena we identify empirically. We experiment with two well-known SMR implementations (Multi-Paxos and Raft) and show that, indeed, SMR is more expensive than consensus in practice. One important implication of our result is that - even under synchrony conditions - no SMR algorithm can ensure bounded response times.

Amazon EC2. http://aws.amazon.com/ec2/. [Online; accessed 9-May-2018].
etcd. https://github.com/coreos/etcd. [Online; accessed 9-May-2018].
LibPaxos3. https://bitbucket.org/sciascid/libpaxos. [Online; accessed 9-May-2018].
Karolos Antoniadis, Rachid Guerraoui, Dahlia Malkhi, and Dragos-Adrian Seredinschi. State Machine Replication is More Expensive Than Consensus. Technical Report 256238, EPFL, 2018. URL: https://infoscience.epfl.ch/record/256238.
B. Arun, S. Peluso, R. Palmieri, G. Losa, and B. Ravindran. Speeding up Consensus by Chasing Fast Decisions. In DSN, 2017.
Peter Bailis and Kyle Kingsbury. The network is reliable. ACM Queue, 12(7):20, 2014.
Alysson Bessani, Marcel Santos, João Felix, Nuno Neves, and Miguel Correia. On the efficiency of durable state machine replication. In ATC, 2013.
Alysson Bessani, João Sousa, and Eduardo EP Alchieri. State machine replication for the masses with bft-smart. In DSN, 2014.
Martin Biely, Peter Robinson, Ulrich Schmid, Manfred Schwarz, and Kyrill Winkler. Gracefully degrading consensus and k-set agreement in directed dynamic networks. Theoretical Computer Science, 726:41-77, 2018.
Christian Cachin, Rachid Guerraoui, and Luìs Rodrigues. Introduction to Reliable and Secure Distributed Programming. Springer, 2011.
Lásaro Jonas Camargos, Rodrigo Malta Schmidt, and Fernando Pedone. Multicoordinated agreement protocols for higher availability. In Network Computing and Applications, 2008.
Daniel Cason, Parisa J Marandi, Luiz E Buzato, and Fernando Pedone. Chasing the tail of atomic broadcast protocols. In SRDS, 2015.
Tushar D. Chandra, Robert Griesemer, and Joshua Redstone. Paxos made live: An engineering perspective. In PODC, 2007.
Tushar D. Chandra and Sam Toueg. Unreliable failure detectors for reliable distributed systems. Journal of the ACM (JACM), 43(2):225-267, 1996.
Bernadette Charron-Bost and André Schiper. The heard-of model: computing in distributed systems with benign faults. Distributed Computing, 22(1):49-71, Apr 2009.
James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, J. J. Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, and Dale Woodford. Spanner: Google’s globally distributed database. ACM TOCS, 31(3):8:1-8:22, 2013.
Jeffrey Dean and Luiz André Barroso. The tail at scale. Communications of the ACM, 56(2):74-80, 2013.
Tzilla Elrad and Nissim Francez. Decomposition of distributed programs into communication-closed layers. Science of Computer Programming, 2(3):155-173, 1982.
Michael J Fischer, Nancy A Lynch, and Michael S Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM (JACM), 32(2):374-382, 1985.
Eli Gafni. Round-by-round fault detectors (extended abstract): Unifying synchrony and asynchrony. In PODC, 1998.
Álvaro García-Pérez, Alexey Gotsman, Yuri Meshman, and Ilya Sergey. Paxos consensus, deconstructed and abstracted (extended version). CoRR, abs/1802.05969, 2018.
Chryssis Georgiou, Seth Gilbert, Rachid Guerraoui, and Dariusz R. Kowalski. On the complexity of asynchronous gossip. In PODC, 2008.
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The Google File System. In SOSP, 2003.
Rachid Guerraoui, Matej Pavlovic, and Dragos-Adrian Seredinschi. Incremental consistency guarantees for replicated objects. In OSDI, 2016.
Heidi Howard and Jon Crowcroft. Coracle: Evaluating Consensus at the Internet Edge. In SIGCOMM, 2015.
Heidi Howard, Dahlia Malkhi, and Alexander Spiegelman. Flexible Paxos: Quorum Intersection Revisited. In OPODIS, 2016.
Patrick Hunt, Mahadev Konar, Flavio Paiva Junqueira, and Benjamin Reed. Zookeeper: Wait-free coordination for internet-scale systems. In USENIX ATC, 2010.
Idit Keidar and Sergio Rajsbaum. On the cost of fault-tolerant consensus when there are no faults: Preliminary version. SIGACT News, 32(2):45-63, 2001.
M. Kleppmann. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O'Reilly Media, 2017.
Tim Kraska, Gene Pang, Michael J Franklin, Samuel Madden, and Alan Fekete. MDCC: Multi-data center consistency. In EuroSys, 2013.
Leslie Lamport. Time, Clocks, and the Ordering of Events in a Distributed System. Commun. ACM, 21(7):558-565, 1978.
Leslie Lamport. The part-time parliament. ACM TOCS, 16(2):133-169, 1998.
Leslie Lamport. Paxos made simple. ACM Sigact News, 32(4):18-25, 2001.
Leslie Lamport. Lower bounds for asynchronous consensus. In Future Directions in Distributed Computing, pages 22-23. Springer, 2003.
Leslie Lamport. Fast paxos. Distributed Computing, 19(2):79-103, 2006.
Leslie Lamport, Dahlia Malkhi, and Lidong Zhou. Stoppable Paxos. TechReport, Microsoft Research, 2008.
Leslie Lamport and Mike Massa. Cheap paxos. In DSN, 2004.
Barbara Liskov, Sanjay Ghemawat, Robert Gruber, Paul Johnson, Liuba Shrira, and Michael Williams. Replication in the Harp File System. In SOSP, 1991.
John MacCormick, Nick Murphy, Marc Najork, Chandu Thekkath, and Lidong Zhou. Boxwood: Abstractions as the foundation for storage infrastructure. In OSDI, 2004.
Iulian Moraru, David G. Andersen, and Michael Kaminsky. There is More Consensus in Egalitarian Parliaments. In SOSP, 2013.
Yoram Moses and Sergio Rajsbaum. A layered analysis of consensus. SIAM Journal on Computing, 31(4):989-1021, 2002.
A. Mostefaoui and M. Raynal. Low cost consensus-based atomic broadcast. In Proceedings. 2000 Pacific Rim International Symposium on Dependable Computing, 2000.
Diego Ongaro and John Ousterhout. In search of an understandable consensus algorithm. In ATC, 2014.
Nicola Santoro and Peter Widmayer. Time is not a healer. In STACS, 1989.
Nicola Santoro and Peter Widmayer. Agreement in synchronous networks with ubiquitous faults. Theoretical Computer Science, 384(2):232-249, 2007.
Nuno Santos and André Schiper. Tuning paxos for high-throughput with batching and pipelining. In International Conference on Distributed Computing and Networking, pages 153-167. Springer, 2012.
Ulrich Schmid, Bettina Weiss, and Idit Keidar. Impossibility results and lower bounds for consensus under link failures. SIAM Journal on Computing, 38(5):1912-1951, 2009.
Ulrich Schmid, Bettina Weiss, and John Rushby. Formally verified byzantine agreement in presence of link faults. In ICDCS, 2002.
Fred B. Schneider. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Comput. Surv., 22(4):299-319, 1990.
Emil Sit, Andreas Haeberlen, Frank Dabek, Byung-Gon Chun, Hakim Weatherspoon, Robert Morris, M Frans Kaashoek, and John Kubiatowicz. Proactive Replication for Data Durability. In IPTPS, 2006.
Robert H Thomas. A majority consensus approach to concurrency control for multiple copy databases. ACM TODS, 4(2):180-209, 1979.
Gustavo M. D. Vieira, Islene C. Garcia, and Luiz Eduardo Buzato. Seamless paxos coordinators. CoRR, abs/1710.07845, 2017. URL: http://arxiv.org/abs/1710.07845.
Matt Welsh, David Culler, and Eric Brewer. Seda: An architecture for well-conditioned, scalable internet services. In SOSP, 2001.
Benjamin Wester, James A Cowling, Edmund B Nightingale, Peter M Chen, Jason Flinn, and Barbara Liskov. Tolerating latency in replicated state machines through client speculation. In NSDI, 2009.

State Machine Replication Is More Expensive Than Consensus

Authors Karolos Antoniadis, Rachid Guerraoui, Dahlia Malkhi, Dragos-Adrian Seredinschi

File

Document Identifiers

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

References

Thanks for your feedback!

Could not send message

State Machine Replication Is More Expensive Than Consensus

Authors Karolos Antoniadis, Rachid Guerraoui, Dahlia Malkhi, Dragos-Adrian Seredinschi

File

Document Identifiers

Related Versions

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

References

Thanks for your feedback!

Could not send message