Design and Analysis of a Logless Dynamic Reconfiguration Protocol

Authors William Schultz, Siyuan Zhou, Ian Dardik, Stavros Tripakis



PDF
Thumbnail PDF

File

LIPIcs.OPODIS.2021.26.pdf
  • Filesize: 0.87 MB
  • 16 pages

Document Identifiers

Author Details

William Schultz
  • Northeastern University, Boston, MA, USA
Siyuan Zhou
  • MongoDB, New York, NY, USA
Ian Dardik
  • Northeastern University, Boston, MA, USA
Stavros Tripakis
  • Northeastern University, Boston, MA,USA

Acknowledgements

We would like to thank Tess Avitabile for her critical insights during the development of the reconfiguration protocol and discovery of subtle bugs in early design proposals. We would like to thank Judah Schvimer, A. Jesse Jiryu Davis, Pavi Vetriselvan, and Ali Mir for offering helpful insights during the protocol design and implementation process. We would also like to thank Shuai Mu for providing helpful comments on initial drafts of this paper.

Cite As Get BibTex

William Schultz, Siyuan Zhou, Ian Dardik, and Stavros Tripakis. Design and Analysis of a Logless Dynamic Reconfiguration Protocol. In 25th International Conference on Principles of Distributed Systems (OPODIS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 217, pp. 26:1-26:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022) https://doi.org/10.4230/LIPIcs.OPODIS.2021.26

Abstract

Distributed replication systems based on the replicated state machine model have become ubiquitous as the foundation of modern database systems. To ensure availability in the presence of faults, these systems must be able to dynamically replace failed nodes with healthy ones via dynamic reconfiguration. MongoDB is a document oriented database with a distributed replication mechanism derived from the Raft protocol. In this paper, we present MongoRaftReconfig, a novel dynamic reconfiguration protocol for the MongoDB replication system. MongoRaftReconfig utilizes a logless approach to managing configuration state and decouples the processing of configuration changes from the main database operation log. The protocol’s design was influenced by engineering constraints faced when attempting to redesign an unsafe, legacy reconfiguration mechanism that existed previously in MongoDB. We provide a safety proof of MongoRaftReconfig, along with a formal specification in TLA+. To our knowledge, this is the first published safety proof and formal specification of a reconfiguration protocol for a Raft-based system. We also present results from model checking the safety properties of MongoRaftReconfig on finite protocol instances. Finally, we discuss the conceptual novelties of MongoRaftReconfig, how it can be understood as an optimized and generalized version of the single server reconfiguration algorithm of Raft, and present an experimental evaluation of how its optimizations can provide performance benefits for reconfigurations.

Subject Classification

ACM Subject Classification
  • Information systems → Parallel and distributed DBMSs
  • Software and its engineering → Software verification
Keywords
  • Fault Tolerance
  • Dynamic Reconfiguration
  • State Machine Replication

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Marcos Aguilera, Idit Keidar, Dahlia Malkhi, Jean-Philippe Martin, and Alexander Shraer. Reconfiguring Replicated Atomic Storage: A Tutorial. Bulletin of the European Association for Theoretical Computer Science EATCS, 2010. Google Scholar
  2. Brandon Amos and Huanchen Zhang. Specifying and proving cluster membership for the Raft distributed consensus algorithm, 2015. URL: https://www.cs.cmu.edu/~aplatzer/course/pls15/projects/bamos.pdf.
  3. Yves Bertot and Pierre Castéran. Interactive theorem proving and program development: Coq’Art: the calculus of inductive constructions. Springer Science & Business Media, 2013. Google Scholar
  4. Tushar D Chandra, Robert Griesemer, and Joshua Redstone. Paxos Made Live: An Engineering Perspective. In Proceedings of the Twenty-Sixth Annual ACM Symposium on Principles of Distributed Computing, PODC '07, pages 398-407, New York, NY, USA, 2007. Association for Computing Machinery. URL: https://doi.org/10.1145/1281100.1281103.
  5. Kaustuv Chaudhuri, Damien Doligez, Leslie Lamport, and Stephan Merz. Verifying safety properties with the tla+ proof system. In International Joint Conference on Automated Reasoning, pages 142-148. Springer, 2010. Google Scholar
  6. Edmund M Clarke, William Klieber, Miloš Nováček, and Paolo Zuliani. Model checking and the state explosion problem. In LASER Summer School on Software Engineering, pages 1-30. Springer, 2011. Google Scholar
  7. James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, J. J. Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, and Dale Woodford. Spanner: Google’s globally-distributed database. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2012, 2012. URL: https://doi.org/10.1145/2518037.2491245.
  8. Seth Gilbert, Nancy A. Lynch, and Alexander A. Shvartsman. Rambo: A robust, reconfigurable atomic memory service for dynamic networks. Distributed Computing, 2010. URL: https://doi.org/10.1007/s00446-010-0117-1.
  9. Dongxu Huang, Qi Liu, Qiu Cui, Zhuhe Fang, Xiaoyu Ma, Fei Xu, Li Shen, Liu Tang, Yuxing Zhou, Menglong Huang, Wan Wei, Cong Liu, Jian Zhang, Jianjun Li, Xuelian Wu, Lingyu Song, Ruoxi Sun, Shuaipeng Yu, Lei Zhao, Nicholas Cameron, Liquan Pei, and Xin Tang. TiDB: a Raft-based HTAP database. Proceedings of the VLDB Endowment, 2020. URL: https://doi.org/10.14778/3415478.3415535.
  10. Leander Jehl and Hein Meling. Asynchronous reconfiguration for Paxos state machines. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2014. URL: https://doi.org/10.1007/978-3-642-45249-9_8.
  11. Petr Kuznetsov and Andrei Tonkikh. Asynchronous Reconfiguration with Byzantine Failures. In Hagit Attiya, editor, 34th International Symposium on Distributed Computing (DISC 2020), volume 179 of Leibniz International Proceedings in Informatics (LIPIcs), pages 27:1-27:17, Dagstuhl, Germany, 2020. Schloss Dagstuhl-Leibniz-Zentrum für Informatik. URL: https://doi.org/10.4230/LIPIcs.DISC.2020.27.
  12. Leslie Lamport. The Part-Time Parliament. ACM Transactions on Computer Systems, 1998. URL: https://doi.org/10.1145/279227.279229.
  13. Leslie Lamport. Paxos Made Simple. ACM SIGACT News, 2001. URL: https://doi.org/10.1145/568425.568433.
  14. Leslie Lamport, Dahlia Malkhi, and Lidong Zhou. Stoppable paxos. TechReport, Microsoft Research, 2008. Google Scholar
  15. Leslie Lamport, Dahlia Malkhi, and Lidong Zhou. Vertical Paxos and Primary-Backup Replication. In Proceedings of the 28th ACM Symposium on Principles of Distributed Computing, PODC '09, pages 312-313, New York, NY, USA, 2009. Association for Computing Machinery. URL: https://doi.org/10.1145/1582716.1582783.
  16. Stephan Merz. The Specification Language TLA+, pages 401-451. Springer Berlin Heidelberg, Berlin, Heidelberg, 2008. URL: https://doi.org/10.1007/978-3-540-74107-7_8.
  17. MongoDB Github Project, 2021. URL: https://github.com/mongodb/mongo.
  18. MongoDB JIRA Ticket SERVER-46907, 2020. URL: https://jira.mongodb.org/browse/SERVER-46907.
  19. Diego Ongaro. Consensus: Bridging Theory and Practice. Doctoral thesis, 2014. Google Scholar
  20. Diego Ongaro. Bug in single-server membership changes, July 2015. URL: https://groups.google.com/g/raft-dev/c/t4xj6dJTP6E/m/d2D9LrWRza8J.
  21. Diego Ongaro. The Raft Consensus Algorithm, 2021. URL: https://raft.github.io/.
  22. Diego Ongaro and John Ousterhout. In Search of an Understandable Consensus Algorithm. In Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference, USENIX ATC'14, pages 305-320, USA, 2014. USENIX Association. Google Scholar
  23. Denis Rystsov. CASPaxos: Replicated State Machines without logs, 2018. URL: http://arxiv.org/abs/1802.07000.
  24. Fred B. Schneider. Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial. ACM Computing Surveys (CSUR), 1990. URL: https://doi.org/10.1145/98163.98167.
  25. William Schultz. MongoDB Experiment Source Code, September 2021. URL: https://github.com/will62794/mongo/tree/2bb9f30da.
  26. William Schultz. MongoRaftReconfig TLA+ Specifications, November 2021. URL: https://doi.org/10.5281/zenodo.5715511.
  27. William Schultz, Tess Avitabile, and Alyson Cabral. Tunable Consistency in MongoDB. Proc. VLDB Endow., 12(12):2071-2081, August 2019. URL: https://doi.org/10.14778/3352063.3352125.
  28. William Schultz, Siyuan Zhou, Ian Dardik, and Stavros Tripakis. Design and Analysis of a Logless Dynamic Reconfiguration Protocol. arXiv preprint, 2021. URL: http://arxiv.org/abs/2102.11960.
  29. Alexander Shraer, Benjamin Reed, Dahlia Malkhi, and Flavio Junqueira. Dynamic reconfiguration of primary/backup clusters. In Proceedings of the 2012 USENIX Annual Technical Conference, USENIX ATC 2012, 2019. Google Scholar
  30. Rebecca Taft, Irfan Sharif, Andrei Matei, Nathan VanBenschoten, Jordan Lewis, Tobias Grieger, Kai Niemi, Andy Woods, Anne Birzin, Raphael Poss, Paul Bardea, Amruta Ranade, Ben Darnell, Bram Gruneir, Justin Jaffray, Lucy Zhang, and Peter Mattis. CockroachDB: The Resilient Geo-Distributed SQL Database. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, SIGMOD '20, pages 1493-1509, New York, NY, USA, 2020. Association for Computing Machinery. URL: https://doi.org/10.1145/3318464.3386134.
  31. Michael Whittaker, Neil Giridharan, Adriana Szekeres, Joseph M Hellerstein, Heidi Howard, Faisal Nawab, and Ion Stoica. Matchmaker Paxos: A Reconfigurable Consensus Protocol [Technical Report], 2020. URL: http://arxiv.org/abs/2007.09468.
  32. Doug Woos, James R. Wilcox, Steve Anton, Zachary Tatlock, Michael D. Ernst, and Thomas Anderson. Planning for change in a formal verification of the raft consensus protocol. In CPP 2016 - Proceedings of the 5th ACM SIGPLAN Conference on Certified Programs and Proofs, co-located with POPL 2016, 2016. URL: https://doi.org/10.1145/2854065.2854081.
  33. Yuan Yu, Panagiotis Manolios, and Leslie Lamport. Model checking TLA+ specifications. In Advanced Research Working Conference on Correct Hardware Design and Verification Methods, pages 54-66. Springer, 1999. Google Scholar
  34. Siyuan Zhou and Shuai Mu. Fault-Tolerant Replication with Pull-Based Consensus in MongoDB. In NSDI, pages 687-703, 2021. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail