CFT-Forensics: High-Performance Byzantine Accountability for Crash Fault Tolerant Protocols

Tang, Weizhao; Sheng, Peiyao; Ni, Ronghao; Roy, Pronoy; Wang, Xuechao; Fanti, Giulia; Viswanath, Pramod

doi:10.4230/LIPIcs.AFT.2024.3

Abstract

Crash fault tolerant (CFT) consensus algorithms are commonly used in scenarios where system components are trusted, e.g., enterprise settings and government infrastructure. However, CFT consensus can be broken by even a single corrupt node. A desirable property in the face of such potential Byzantine faults is accountability: if a corrupt node breaks the protocol and affects consensus safety, it should be possible to identify the culpable components with cryptographic integrity from the node states. Today, the best-known protocol for providing accountability to CFT protocols is PeerReview, which essentially records a signed transcript of all messages sent during the CFT protocol. Because PeerReview is agnostic to the underlying CFT protocol, it incurs high communication and storage overhead. We propose CFT-Forensics, an accountability framework for CFT protocols. We show that for a special family of forensics-compliant CFT protocols (which includes widely used CFT protocols like Raft and multi-Paxos), CFT-Forensics gives provable accountability guarantees. Under realistic deployment settings, we show theoretically that CFT-Forensics operates at a fraction of the cost of PeerReview. We subsequently instantiate CFT-Forensics for Raft, and implement Raft-Forensics as an extension to the popular NuRaft library. In extensive experiments, we demonstrate that Raft-Forensics adds low overhead to vanilla Raft. With 256-byte messages, Raft-Forensics achieves a peak throughput that is 87.8% of vanilla Raft's, at 46% higher latency (+44 ms). Finally, we integrate Raft-Forensics into the open-source central bank digital currency OpenCBDC, and show that in wide-area network experiments, Raft-Forensics achieves 97.8% of the throughput of Raft, with 14.5% higher latency (+326 ms).

