Search Results

Documents authored by Dhoked, Sahil


Document
Modular Recoverable Mutual Exclusion Under System-Wide Failures

Authors: Sahil Dhoked, Wojciech Golab, and Neeraj Mittal

Published in: LIPIcs, Volume 281, 37th International Symposium on Distributed Computing (DISC 2023)


Abstract
Recoverable mutual exclusion (RME) is a fault-tolerant variation of Dijkstra’s classic mutual exclusion (ME) problem that allows processes to fail by crashing as long as they recover eventually. A growing body of literature on this topic, starting with the problem formulation by Golab and Ramaraju (PODC'16), examines the cost of solving the RME problem, which is quantified by counting the expensive shared memory operations called remote memory references (RMRs), under a variety of conditions. Published results show that the RMR complexity of RME algorithms, among other factors, depends crucially on the failure model used: individual process versus system-wide. Recent work by Golab and Hendler (PODC'18) also suggests that explicit failure detection can be helpful in attaining constant RMR solutions to the RME problem in the system-wide failure model. Follow-up work by Jayanti, Jayanti, and Joshi (SPAA'23) shows that such a solution exists even without employing a failure detector, albeit this solution uses a more complex algorithmic approach. In this work, we dive deeper into the study of RMR-optimal RME algorithms for the system-wide failure model, and present contributions along multiple directions. First, we introduce the notion of withdrawing from a lock acquisition rather than resetting the lock. We use this notion to design a withdrawable RME algorithm with optimal O(1) RMR complexity for both cache-coherent (CC) and distributed shared memory (DSM) models in a modular way without using an explicit failure detector. In some sense, our technique marries the simplicity of Golab and Hendler’s algorithm with Jayanti, Jayanti and Joshi’s weaker system model. Second, we present a variation of our algorithm that supports fully dynamic process participation (i.e., both joining and leaving) in the CC model, while maintaining its constant RMR complexity. We show experimentally that our algorithm is substantially faster than Jayanti, Jayanti, and Joshi’s algorithm despite having stronger correctness properties. Finally, we establish an impossibility result for fully dynamic RME algorithms with bounded RMR complexity in the DSM model that are adaptive with respect to space, and provide a wait-free withdraw section.

Cite as

Sahil Dhoked, Wojciech Golab, and Neeraj Mittal. Modular Recoverable Mutual Exclusion Under System-Wide Failures. In 37th International Symposium on Distributed Computing (DISC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 281, pp. 17:1-17:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{dhoked_et_al:LIPIcs.DISC.2023.17,
  author =	{Dhoked, Sahil and Golab, Wojciech and Mittal, Neeraj},
  title =	{{Modular Recoverable Mutual Exclusion Under System-Wide Failures}},
  booktitle =	{37th International Symposium on Distributed Computing (DISC 2023)},
  pages =	{17:1--17:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-301-0},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{281},
  editor =	{Oshman, Rotem},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DISC.2023.17},
  URN =		{urn:nbn:de:0030-drops-191431},
  doi =		{10.4230/LIPIcs.DISC.2023.17},
  annote =	{Keywords: mutual exclusion, shared memory, persistent memory, fault tolerance, system-wide failure, RMR complexity, dynamic joining, dynamic leaving}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail