Documents authored by Michael, Ellis


Document
Recovering Shared Objects Without Stable Storage

Authors: Ellis Michael, Dan R. K. Ports, Naveen Kr. Sharma, and Adriana Szekeres

Published in: LIPIcs, Volume 91, 31st International Symposium on Distributed Computing (DISC 2017)


Abstract
This paper considers the problem of building fault-tolerant shared objects when processes can crash and recover but lose their persistent state on recovery. This Diskless Crash-Recovery (DCR) model matches the way many long-lived systems are built. We show that it presents new challenges, as operations that are recorded at a quorum may not persist after some of the processes in that quorum crash and then recover. To address this problem, we introduce the notion of crash-consistent quorums, where no recoveries happen during the quorum responses. We show that relying on crash-consistent quorums enables a recovery procedure that can recover all operations that successfully finished. Crash-consistent quorums can be easily identified using a mechanism we term the crash vector, which tracks the causal relationship between crashes, recoveries, and other operations. We apply crash-consistent quorums and crash vectors to build two storage primitives. We give a new algorithm for multi-writer, multi-reader atomic registers in the DCR model that guarantees safety under all conditions and termination under a natural condition. It improves on the best prior protocol for this problem by requiring fewer rounds, fewer nodes to participate in the quorum, and a less restrictive liveness condition. We also present a more efficient single-writer, single-reader atomic set, a virtual stable storage abstraction. It can be used to lift any existing algorithm from the traditional Crash-Recovery model to the DCR model. We examine a specific application, state machine replication, and show that existing diskless protocols can violate their correctness guarantees, while ours offers a general and correct solution.

Cite as

Ellis Michael, Dan R. K. Ports, Naveen Kr. Sharma, and Adriana Szekeres. Recovering Shared Objects Without Stable Storage. In 31st International Symposium on Distributed Computing (DISC 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 91, pp. 36:1-36:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


@InProceedings{michael_et_al:LIPIcs.DISC.2017.36,
  author =	{Michael, Ellis and Ports, Dan R. K. and Sharma, Naveen Kr. and Szekeres, Adriana},
  title =	{{Recovering Shared Objects Without Stable Storage}},
  booktitle =	{31st International Symposium on Distributed Computing (DISC 2017)},
  pages =	{36:1--36:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-053-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{91},
  editor =	{Richa, Andr\'{e}a},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DISC.2017.36},
  URN =		{urn:nbn:de:0030-drops-80055},
  doi =		{10.4230/LIPIcs.DISC.2017.36},
  annote =	{Keywords: asynchronous system, fault-tolerance, crash-recovery, R/W register, state machine replication}
}