Liveness and Latency of Byzantine State-Machine Replication

Authors Manuel Bravo, Gregory Chockler, Alexey Gotsman



PDF
Thumbnail PDF

File

LIPIcs.DISC.2022.12.pdf
  • Filesize: 0.88 MB
  • 19 pages

Document Identifiers

Author Details

Manuel Bravo
  • Informal Systems, Madrid, Spain
Gregory Chockler
  • University of Surrey, UK
Alexey Gotsman
  • IMDEA Software Institute, Madrid, Spain

Acknowledgements

We thank the following people for comments that helped improve the paper: Lăcrămioara Aştefănoaei, Hagit Attiya, Alysson Besani, Armando Castañeda, Peter Davies, Dan O'Keeffe, Idit Keidar, Giuliano Losa, Alejandro Naser, and Eugen Zălinescu.

Cite As Get BibTex

Manuel Bravo, Gregory Chockler, and Alexey Gotsman. Liveness and Latency of Byzantine State-Machine Replication. In 36th International Symposium on Distributed Computing (DISC 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 246, pp. 12:1-12:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022) https://doi.org/10.4230/LIPIcs.DISC.2022.12

Abstract

Byzantine state-machine replication (SMR) ensures the consistency of replicated state in the presence of malicious replicas and lies at the heart of the modern blockchain technology. Byzantine SMR protocols often guarantee safety under all circumstances and liveness only under synchrony. However, guaranteeing liveness even under this assumption is nontrivial. So far we have lacked systematic ways of incorporating liveness mechanisms into Byzantine SMR protocols, which often led to subtle bugs. To close this gap, we introduce a modular framework to facilitate the design of provably live and efficient Byzantine SMR protocols. Our framework relies on a view abstraction generated by a special SMR synchronizer primitive to drive the agreement on command ordering. We present a simple formal specification of an SMR synchronizer and its bounded-space implementation under partial synchrony. We also apply our specification to prove liveness and analyze the latency of three Byzantine SMR protocols via a uniform methodology. In particular, one of these results yields what we believe is the first rigorous liveness proof for the algorithmic core of the seminal PBFT protocol.

Subject Classification

ACM Subject Classification
  • Theory of computation → Distributed computing models
Keywords
  • Replication
  • blockchain
  • partial synchrony
  • liveness

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. DiemBFT v4: State machine replication in the Diem blockchain. URL: https://developers.diem.com/papers/diem-consensus-state-machine-replication-in-the-diem-blockchain/2021-08-17.pdf.
  2. Incorrect by construction-CBC Casper isn't live. URL: https://derekhsorensen.com/docs/CBC_Casper_Flaw.pdf.
  3. Ittai Abraham, Srinivas Devadas, Danny Dolev, Kartik Nayak, and Ling Ren. Synchronous Byzantine agreement with expected O(1) rounds, expected O(n²) communication, and optimal resilience. In Conference on Financial Cryptography and Data Security (FC), 2019. Google Scholar
  4. Ittai Abraham, Guy Gueta, Dahlia Malkhi, Lorenzo Alvisi, Ramakrishna Kotla, and Jean-Philippe Martin. Revisiting fast practical Byzantine fault tolerance. arXiv, abs/1712.01367, 2017. URL: http://arxiv.org/abs/1712.01367.
  5. Ittai Abraham, Kartik Nayak, Ling Ren, and Zhuolun Xiang. Good-case latency of Byzantine broadcast: a complete categorization. In Symposium on Principles of Distributed Computing (PODC), 2021. Google Scholar
  6. Lăcrămioara Aştefănoaei, Pierre Chambart, Antonella Del Pozzo, Thibault Rieutord, Sara Tucci-Piergiovanni, and Eugen Zălinescu. Tenderbake - A solution to dynamic repeated consensus for blockchains. In Symposium on Foundations and Applications of Blockchain (FAB), 2021. Google Scholar
  7. Dan Alistarh, Seth Gilbert, Rachid Guerraoui, and Corentin Travers. Generating fast indulgent algorithms. In International Conference on Distributed Computing and Networking (ICDCN), 2011. Google Scholar
  8. Yackolley Amoussou-Guenou, Antonella Del Pozzo, Maria Potop-Butucaru, and Sara Tucci-Piergiovanni. Correctness of Tendermint-core blockchains. In Conference on Principles of Distributed Systems (OPODIS), 2018. Google Scholar
  9. Yackolley Amoussou-Guenou, Antonella Del Pozzo, Maria Potop-Butucaru, and Sara Tucci-Piergiovanni. Dissecting Tendermint. In Conference on Networked Systems (NETYS), 2019. Google Scholar
  10. Baruch Awerbuch. Complexity of network synchronization. J. ACM, 32(4):804-823, 1985. Google Scholar
  11. Rida A. Bazzi and Yin Ding. Non-skipping timestamps for Byzantine data storage systems. In Symposium on Distributed Computing (DISC), 2004. Google Scholar
  12. Christian Berger, Hans P. Reiser, and Alysson Bessani. Making reads in BFT state machine replication fast, linearizable, and live. In Symposium on Reliable Distributed Systems (SRDS), 2021. Google Scholar
  13. Alysson Neves Bessani, João Sousa, and Eduardo Adílio Pelinson Alchieri. State machine replication for the masses with BFT-SMART. In Conference on Dependable Systems and Networks (DSN), 2014. Google Scholar
  14. Martin Biely, Josef Widder, Bernadette Charron-Bost, Antoine Gaillard, Martin Hutle, and André Schiper. Tolerating corrupted communication. In Symposium on Principles of Distributed Computing (PODC), 2007. Google Scholar
  15. Gabriel Bracha. Asynchronous Byzantine agreement protocols. Inf. Comput., 75(2):130-143, 1987. Google Scholar
  16. Manuel Bravo, Gregory Chockler, and Alexey Gotsman. Making Byzantine consensus live. In Symposium on Distributed Computing (DISC), 2020. Google Scholar
  17. Manuel Bravo, Gregory Chockler, and Alexey Gotsman. Liveness and latency of Byzantine state-machine replication (extended version). arXiv, abs/2202.06679, 2022. URL: http://arxiv.org/abs/2202.06679.
  18. Ethan Buchman, Jae Kwon, and Zarko Milosevic. The latest gossip on BFT consensus. arXiv, abs/1807.04938, 2018. URL: http://arxiv.org/abs/1807.04938.
  19. Christian Cachin. Personal communication, 2022. Google Scholar
  20. Christian Cachin, Rachid Guerraoui, and Luís E. T. Rodrigues. Introduction to Reliable and Secure Distributed Programming (2 ed.). Springer, 2011. Google Scholar
  21. Christian Cachin, Klaus Kursawe, Frank Petzold, and Victor Shoup. Secure and efficient asynchronous broadcast protocols. In International Cryptology Conference (CRYPTO), 2001. Google Scholar
  22. Christian Cachin and Marko Vukolić. Blockchain consensus protocols in the wild (keynote talk). In Symposium on Distributed Computing (DISC), 2017. Google Scholar
  23. Miguel Castro. Practical Byzantine Fault Tolerance. PhD thesis, Massachusetts Institute of Technology, 2001. Google Scholar
  24. Miguel Castro and Barbara Liskov. Practical Byzantine fault tolerance. In Symposium on Operating Systems Design and Implementation (OSDI), 1999. Google Scholar
  25. Miguel Castro and Barbara Liskov. Practical Byzantine fault tolerance and proactive recovery. ACM Transactions on Computer Systems, 20(4):398-461, 2002. Google Scholar
  26. Tushar Deepak Chandra and Sam Toueg. Unreliable failure detectors for reliable distributed systems. J. ACM, 43(2):225-267, 1996. Google Scholar
  27. Bernadette Charron-Bost and André Schiper. The Heard-Of model: computing in distributed systems with benign faults. Distributed Comput., 22(1):49-71, 2009. Google Scholar
  28. Allen Clement, Edmund Wong, Lorenzo Alvisi, Mike Dahlin, and Mirco Marchetti. Making Byzantine fault tolerant systems tolerate Byzantine faults. In Symposium on Networked Systems Design and Implementation (NSDI), 2009. Google Scholar
  29. Danny Dolev, Joseph Y. Halpern, Barbara Simons, and Ray Strong. Dynamic fault-tolerant clock synchronization. J. ACM, 42(1):143-185, 1995. Google Scholar
  30. Assia Doudou, Benoît Garbinato, and Rachid Guerraoui. Abstractions for devising Byzantine-resilient state machine replication. In Symposium on Reliable Distributed Systems (SRDS), 2000. Google Scholar
  31. Cezara Dragoi, Josef Widder, and Damien Zufferey. Programming at the edge of synchrony. Proc. ACM Program. Lang., 4(OOPSLA), 2020. Google Scholar
  32. Cynthia Dwork, Nancy A. Lynch, and Larry J. Stockmeyer. Consensus in the presence of partial synchrony. J. ACM, 35(2):288-323, 1988. Google Scholar
  33. Michael J. Fischer, Nancy A. Lynch, and Mike Paterson. Impossibility of distributed consensus with one faulty process. J. ACM, 32(2):374-382, 1985. Google Scholar
  34. Felix C. Freiling, Rachid Guerraoui, and Petr Kuznetsov. The failure detector abstraction. ACM Comput. Surv., 43(2):9:1-9:40, 2011. Google Scholar
  35. Eli Gafni. Round-by-round fault detectors: Unifying synchrony and asynchrony. In Symposium on Principles of Distributed Computing (PODC), 1998. Google Scholar
  36. Seth Gilbert, Rachid Guerraoui, and Dariusz R. Kowalski. On the message complexity of indulgent consensus. In Symposium on Distributed Computing (DISC), 2007. Google Scholar
  37. Guy Golan-Gueta, Ittai Abraham, Shelly Grossman, Dahlia Malkhi, Benny Pinkas, Michael K. Reiter, Dragos-Adrian Seredinschi, Orr Tamir, and Alin Tomescu. SBFT: A scalable and decentralized trust infrastructure. In Conference on Dependable Systems and Networks (DSN), 2019. Google Scholar
  38. Rachid Guerraoui. Indulgent algorithms (preliminary version). In Symposium on Principles of Distributed Computing (PODC), 2000. Google Scholar
  39. Rachid Guerraoui and Michel Raynal. The information structure of indulgent consensus. IEEE Transactions on Computers, 53(4):453-466, 2004. Google Scholar
  40. Andreas Haeberlen and Petr Kuznetsov. The fault detection problem. In Conference on Principles of Distributed Systems (OPODIS), 2009. Google Scholar
  41. Amir Herzberg and Shay Kutten. Fast isolation of arbitrary forwarding faults. In Symposium on Principles of Distributed Computing (PODC), 1989. Google Scholar
  42. Idit Keidar and Alexander Shraer. Timeliness, failure-detectors, and consensus performance. In Symposium on Principles of Distributed Computing (PODC), 2006. Google Scholar
  43. Kim Potter Kihlstrom, Louise E. Moser, and P. M. Melliar-Smith. Byzantine fault detectors for solving consensus. The Computer Journal, 46(1):16-35, 2003. Google Scholar
  44. Ramakrishna Kotla, Lorenzo Alvisi, Mike Dahlin, Allen Clement, and Edmund Wong. Zyzzyva: Speculative Byzantine fault tolerance. ACM Trans. Comput. Syst., 27(4):7:1-7:39, 2010. Google Scholar
  45. Leslie Lamport. The part-time parliament. ACM Trans. Comput. Syst., 16(2):133-169, 1998. Google Scholar
  46. Dahlia Malkhi and Michael Reiter. Unreliable intrusion detection in distributed computations. In Workshop on Computer Security Foundations (CSFW), 1997. Google Scholar
  47. Achour Mostéfaoui and Michel Raynal. Solving consensus using Chandra-Toueg’s unreliable failure detectors: A general quorum-based approach. In Symposium on Distributed Computing (DISC), 1999. Google Scholar
  48. Oded Naor, Mathieu Baudet, Dahlia Malkhi, and Alexander Spiegelman. Cogsworth: Byzantine view synchronization. In Cryptoeconomics Systems Conference (CES), 2020. Google Scholar
  49. Oded Naor and Idit Keidar. Expected linear round synchronization: The missing link for linear Byzantine SMR. In Symposium on Distributed Computing (DISC), 2020. Google Scholar
  50. Rafael Pass and Elaine Shi. Hybrid consensus: Efficient consensus in the permissionless model. In Symposium on Distributed Computing (DISC), 2017. Google Scholar
  51. Rafael Pass and Elaine Shi. Thunderella: Blockchains with optimistic instant confirmation. In Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), 2018. Google Scholar
  52. Fred B. Schneider. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Comput. Surv., 22(4):299-319, 1990. Google Scholar
  53. Barbara Simons, Jennifer Welch, and Nancy Lynch. An overview of clock synchronization. In Fault-Tolerant Distributed Computing, 1986. Google Scholar
  54. João Sousa. Byzantine State Machine Replication for the Masses. PhD thesis, University of Lisbon, 2017. Google Scholar
  55. Chrysoula Stathakopoulou, Tudor David, and Marko Vukolić. Mir-BFT: High-throughput BFT for blockchains. arXiv, abs/1906.05552, 2019. URL: http://arxiv.org/abs/1906.05552.
  56. Giuliana Santos Veronese, Miguel Correia, Alysson Neves Bessani, and Lau Cheuk Lung. Spin one’s wheels? Byzantine fault tolerance with a spinning primary. In Symposium on Reliable Distributed Systems (SRDS), 2009. Google Scholar
  57. Maofan Yin, Dahlia Malkhi, Michael K. Reiter, Guy Golan-Gueta, and Ittai Abraham. HotStuff: BFT consensus with linearity and responsiveness. In Symposium on Principles of Distributed Computing (PODC), 2019. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail