Search Results

Documents authored by Tripakis, Stavros


Document
Design and Analysis of a Logless Dynamic Reconfiguration Protocol

Authors: William Schultz, Siyuan Zhou, Ian Dardik, and Stavros Tripakis

Published in: LIPIcs, Volume 217, 25th International Conference on Principles of Distributed Systems (OPODIS 2021)


Abstract
Distributed replication systems based on the replicated state machine model have become ubiquitous as the foundation of modern database systems. To ensure availability in the presence of faults, these systems must be able to dynamically replace failed nodes with healthy ones via dynamic reconfiguration. MongoDB is a document oriented database with a distributed replication mechanism derived from the Raft protocol. In this paper, we present MongoRaftReconfig, a novel dynamic reconfiguration protocol for the MongoDB replication system. MongoRaftReconfig utilizes a logless approach to managing configuration state and decouples the processing of configuration changes from the main database operation log. The protocol’s design was influenced by engineering constraints faced when attempting to redesign an unsafe, legacy reconfiguration mechanism that existed previously in MongoDB. We provide a safety proof of MongoRaftReconfig, along with a formal specification in TLA+. To our knowledge, this is the first published safety proof and formal specification of a reconfiguration protocol for a Raft-based system. We also present results from model checking the safety properties of MongoRaftReconfig on finite protocol instances. Finally, we discuss the conceptual novelties of MongoRaftReconfig, how it can be understood as an optimized and generalized version of the single server reconfiguration algorithm of Raft, and present an experimental evaluation of how its optimizations can provide performance benefits for reconfigurations.

Cite as

William Schultz, Siyuan Zhou, Ian Dardik, and Stavros Tripakis. Design and Analysis of a Logless Dynamic Reconfiguration Protocol. In 25th International Conference on Principles of Distributed Systems (OPODIS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 217, pp. 26:1-26:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{schultz_et_al:LIPIcs.OPODIS.2021.26,
  author =	{Schultz, William and Zhou, Siyuan and Dardik, Ian and Tripakis, Stavros},
  title =	{{Design and Analysis of a Logless Dynamic Reconfiguration Protocol}},
  booktitle =	{25th International Conference on Principles of Distributed Systems (OPODIS 2021)},
  pages =	{26:1--26:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-219-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{217},
  editor =	{Bramas, Quentin and Gramoli, Vincent and Milani, Alessia},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.OPODIS.2021.26},
  URN =		{urn:nbn:de:0030-drops-158016},
  doi =		{10.4230/LIPIcs.OPODIS.2021.26},
  annote =	{Keywords: Fault Tolerance, Dynamic Reconfiguration, State Machine Replication}
}
Document
Brief Announcement
Brief Announcement: Design and Verification of a Logless Dynamic Reconfiguration Protocol in MongoDB Replication

Authors: William Schultz, Siyuan Zhou, and Stavros Tripakis

Published in: LIPIcs, Volume 209, 35th International Symposium on Distributed Computing (DISC 2021)


Abstract
We introduce a novel dynamic reconfiguration protocol for the MongoDB replication system that extends and generalizes the single server reconfiguration protocol of the Raft consensus algorithm. Our protocol decouples the processing of configuration changes from the main database operation log, which allows reconfigurations to proceed in cases when the main log is prevented from processing new operations. Additionally, this decoupling allows for configuration state to be managed by a logless replicated state machine, storing only the latest version of the configuration and avoiding the complexities of a log-based protocol. We present a formal specification of the protocol in TLA+, initial verification results of model checking its safety properties, and an experimental evaluation of how reconfigurations are able to quickly restore a system to healthy operation when node failures have stalled the main operation log. This announcement is a short version and the full paper is available at [Schultz et al., 2021].

Cite as

William Schultz, Siyuan Zhou, and Stavros Tripakis. Brief Announcement: Design and Verification of a Logless Dynamic Reconfiguration Protocol in MongoDB Replication. In 35th International Symposium on Distributed Computing (DISC 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 209, pp. 61:1-61:4, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{schultz_et_al:LIPIcs.DISC.2021.61,
  author =	{Schultz, William and Zhou, Siyuan and Tripakis, Stavros},
  title =	{{Brief Announcement: Design and Verification of a Logless Dynamic Reconfiguration Protocol in MongoDB Replication}},
  booktitle =	{35th International Symposium on Distributed Computing (DISC 2021)},
  pages =	{61:1--61:4},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-210-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{209},
  editor =	{Gilbert, Seth},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DISC.2021.61},
  URN =		{urn:nbn:de:0030-drops-148636},
  doi =		{10.4230/LIPIcs.DISC.2021.61},
  annote =	{Keywords: Reconfiguration, Consensus, State Machine Replication}
}
Document
Monitoring, Fault Diagnosis and Testing Real-time Systems using Analog and Digital Clocks

Authors: Stavros Tripakis

Published in: Dagstuhl Seminar Proceedings, Volume 7011, Runtime Verification (2008)


Abstract
We give an overview of known methods for monitoring, fault diagnosis and testing problems for real-time systems using timed automata as the main model. We present techniques for constructing monitors/diagnosers/testers with analog or digital clocks. We list a number of open problems in the field.

Cite as

Stavros Tripakis. Monitoring, Fault Diagnosis and Testing Real-time Systems using Analog and Digital Clocks. In Runtime Verification. Dagstuhl Seminar Proceedings, Volume 7011, pp. 1-2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)


Copy BibTex To Clipboard

@InProceedings{tripakis:DagSemProc.07011.3,
  author =	{Tripakis, Stavros},
  title =	{{Monitoring, Fault Diagnosis and Testing Real-time Systems using Analog and Digital Clocks}},
  booktitle =	{Runtime Verification},
  pages =	{1--2},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{7011},
  editor =	{Bernd Finkbeiner and Klaus Havelund and Grigore Rosu and Oleg Sokolsky},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.07011.3},
  URN =		{urn:nbn:de:0030-drops-13705},
  doi =		{10.4230/DagSemProc.07011.3},
  annote =	{Keywords: Monitoring, fault diagnosis, testing, timed automata}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail