Twins: BFT Systems Made Robust

Bano, Shehar; Sonnino, Alberto; Chursin, Andrey; Perelman, Dmitri; Li, Zekun; Ching, Avery; Malkhi, Dahlia

doi:10.4230/LIPIcs.OPODIS.2021.7

Abstract

This paper presents Twins, an automated unit test generator for Byzantine attacks. Twins implements three types of Byzantine behaviors: (i) leader equivocation, (ii) double voting, and (iii) losing internal state such as forgetting "locks" guarding voted values. To emulate interesting attacks by a Byzantine node, it instantiates twin copies of the node instead of one, giving both twins the same identities and network credentials. To the rest of the system, the twins appear indistinguishable from a single node behaving in a "questionable" manner. Twins can systematically generate Byzantine attack scenarios at scale, execute them in a controlled manner, and examine their behavior. Twins scenarios iterate over protocol rounds and vary the communication patterns among nodes. Twins runs in a production setting within DiemBFT, where it can execute 44M Twins-generated scenarios daily. Whereas the system at hand did not manifest errors, subtle safety bugs that were deliberately injected for the purpose of validating the implementation of Twins itself were exposed within minutes. Twins can prevent developers from regressing correctness when updating the codebase, introducing new features, or performing routine maintenance tasks. Twins only requires a thin wrapper over DiemBFT; we thus envision other systems using it. Building on this idea, one new attack and several known attacks against other BFT protocols were materialized as Twins scenarios. In all cases, the target protocols break within fewer than a dozen protocol rounds; hence it is realistic for the Twins approach to expose the problems.
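The scenario enumeration described in the abstract (iterating over rounds and varying which nodes can communicate) can be illustrated with a minimal Python sketch. It is not the authors' generator: the function names `set_partitions` and `generate_scenarios`, the apostrophe naming of twin copies, and the choice to pair one leader with one network partition per round are all assumptions made for illustration.

```python
from itertools import product


def set_partitions(items):
    """Yield every way to split `items` into non-empty, unordered groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing group in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or into a new group of its own.
        yield [[first]] + partition


def generate_scenarios(nodes, twins, rounds):
    """Yield scenarios as tuples of (leader, partition), one pair per round.

    Hypothetical model: each node listed in `twins` gets a second instance,
    named with a trailing apostrophe, that shares its original's identity.
    Nodes in the same group of a partition can exchange messages.
    """
    instances = list(nodes) + [f"{n}'" for n in twins]
    per_round = [
        (leader, partition)
        for leader in nodes
        for partition in set_partitions(instances)
    ]
    yield from product(per_round, repeat=rounds)


if __name__ == "__main__":
    # Four nodes, with node A duplicated as a twin, over two rounds.
    scenarios = generate_scenarios(["A", "B", "C", "D"], ["A"], rounds=2)
    print(next(scenarios))
```

Even in this toy model the scenario count grows quickly with the number of rounds, since every round multiplies the total by the number of leader and partition choices. That growth is why systematic, automated generation and execution matter.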

Cite As

Shehar Bano, Alberto Sonnino, Andrey Chursin, Dmitri Perelman, Zekun Li, Avery Ching, and Dahlia Malkhi. Twins: BFT Systems Made Robust. In 25th International Conference on Principles of Distributed Systems (OPODIS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 217, pp. 7:1-7:29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022) https://doi.org/10.4230/LIPIcs.OPODIS.2021.7

Author Details

Shehar Bano

Facebook Novi, London, UK

Alberto Sonnino

Facebook Novi, London, UK

Andrey Chursin

Facebook Novi, Menlo Park, CA, USA

Dmitri Perelman

Facebook Novi, Menlo Park, CA, USA

Zekun Li

Facebook Novi, Menlo Park, CA, USA

Avery Ching

Facebook Novi, Menlo Park, CA, USA

Dahlia Malkhi

Facebook Novi, Menlo Park, CA, USA

Funding

This work is funded by Novi, a subsidiary of Facebook.

Acknowledgements

The authors would like to thank Ben Maurer, David Dill, Daniel Xiang, Kartik Nayak, Ling Ren, and Scott Stoller for feedback on a late version of the manuscript, and George Danezis for comments on an early version. We also thank the Novi Research and Engineering teams for valuable feedback.

Supplementary Materials

All artifacts presented in this paper are publicly available. Specifically, they include: (i) the Rust implementation of LibTwins, the Twins framework we implemented for DiemBFT (Section 5); (ii) the artifacts (the AWS orchestration scripts, and the microbenchmarking scripts and data) used to evaluate LibTwins (Section 6); and (iii) the Python simulator and the Twins instantiation of the safety flaw in Fast-HotStuff (Section 3).
Software (Source Code) https://github.com/asonnino/twins-simulator
Software (Source Code) https://github.com/diem/diem

