Search Results

Documents authored by Dhoked, Sahil


Document
RMR-Efficient Detectable Objects for Persistent Memory and Their Applications

Authors: Sahil Dhoked, Ahmed Fahmy, Wojciech Golab, and Neeraj Mittal

Published in: LIPIcs, Volume 324, 28th International Conference on Principles of Distributed Systems (OPODIS 2024)


Abstract
We describe a novel construction of arbitrary read-modify-write (RMW) primitives in a persistent shared memory model with process failures. Our construction uses blocking synchronization, in the form of recoverable mutual exclusion (RME), and is optimal in terms of the widely studied remote memory reference (RMR) complexity measure. The implemented objects tolerate either system-wide or independent process crashes, depending on the RME lock used, and also provide detectability for resolving the outcome of operations interrupted by failures. We prove that our construction is RMR-optimal using a reduction back to the RME problem. Our proof technique introduces a novel algorithmic style that enables solving challenging synchronization problems using a common execution path for both the system-wide and independent failure models, which previously required separate analyses, and relies only on a suitable implementation of the detectable base objects in each model to achieve RMR efficiency. Experiments demonstrate that our construction outperforms prior wait-free and lock-free algorithms on a multiprocessor with Intel Optane persistent memory.

Cite as

Sahil Dhoked, Ahmed Fahmy, Wojciech Golab, and Neeraj Mittal. RMR-Efficient Detectable Objects for Persistent Memory and Their Applications. In 28th International Conference on Principles of Distributed Systems (OPODIS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 324, pp. 5:1-5:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{dhoked_et_al:LIPIcs.OPODIS.2024.5,
  author =	{Dhoked, Sahil and Fahmy, Ahmed and Golab, Wojciech and Mittal, Neeraj},
  title =	{{RMR-Efficient Detectable Objects for Persistent Memory and Their Applications}},
  booktitle =	{28th International Conference on Principles of Distributed Systems (OPODIS 2024)},
  pages =	{5:1--5:26},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-360-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{324},
  editor =	{Bonomi, Silvia and Galletta, Letterio and Rivi\`{e}re, Etienne and Schiavoni, Valerio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.OPODIS.2024.5},
  URN =		{urn:nbn:de:0030-drops-225417},
  doi =		{10.4230/LIPIcs.OPODIS.2024.5},
  annote =	{Keywords: persistent memory, synchronization, recoverability, fault tolerance, detectability, scalability, RMR complexity, theory, mutual exclusion}
}
Document
Modular Recoverable Mutual Exclusion Under System-Wide Failures

Authors: Sahil Dhoked, Wojciech Golab, and Neeraj Mittal

Published in: LIPIcs, Volume 281, 37th International Symposium on Distributed Computing (DISC 2023)


Abstract
Recoverable mutual exclusion (RME) is a fault-tolerant variation of Dijkstra’s classic mutual exclusion (ME) problem that allows processes to fail by crashing as long as they recover eventually. A growing body of literature on this topic, starting with the problem formulation by Golab and Ramaraju (PODC'16), examines the cost of solving the RME problem, which is quantified by counting the expensive shared memory operations called remote memory references (RMRs), under a variety of conditions. Published results show that the RMR complexity of RME algorithms, among other factors, depends crucially on the failure model used: individual process versus system-wide. Recent work by Golab and Hendler (PODC'18) also suggests that explicit failure detection can be helpful in attaining constant RMR solutions to the RME problem in the system-wide failure model. Follow-up work by Jayanti, Jayanti, and Joshi (SPAA'23) shows that such a solution exists even without employing a failure detector, albeit this solution uses a more complex algorithmic approach. In this work, we dive deeper into the study of RMR-optimal RME algorithms for the system-wide failure model, and present contributions along multiple directions. First, we introduce the notion of withdrawing from a lock acquisition rather than resetting the lock. We use this notion to design a withdrawable RME algorithm with optimal O(1) RMR complexity for both cache-coherent (CC) and distributed shared memory (DSM) models in a modular way without using an explicit failure detector. In some sense, our technique marries the simplicity of Golab and Hendler’s algorithm with Jayanti, Jayanti and Joshi’s weaker system model. Second, we present a variation of our algorithm that supports fully dynamic process participation (i.e., both joining and leaving) in the CC model, while maintaining its constant RMR complexity. We show experimentally that our algorithm is substantially faster than Jayanti, Jayanti, and Joshi’s algorithm despite having stronger correctness properties. Finally, we establish an impossibility result for fully dynamic RME algorithms with bounded RMR complexity in the DSM model that are adaptive with respect to space, and provide a wait-free withdraw section.

Cite as

Sahil Dhoked, Wojciech Golab, and Neeraj Mittal. Modular Recoverable Mutual Exclusion Under System-Wide Failures. In 37th International Symposium on Distributed Computing (DISC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 281, pp. 17:1-17:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{dhoked_et_al:LIPIcs.DISC.2023.17,
  author =	{Dhoked, Sahil and Golab, Wojciech and Mittal, Neeraj},
  title =	{{Modular Recoverable Mutual Exclusion Under System-Wide Failures}},
  booktitle =	{37th International Symposium on Distributed Computing (DISC 2023)},
  pages =	{17:1--17:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-301-0},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{281},
  editor =	{Oshman, Rotem},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DISC.2023.17},
  URN =		{urn:nbn:de:0030-drops-191431},
  doi =		{10.4230/LIPIcs.DISC.2023.17},
  annote =	{Keywords: mutual exclusion, shared memory, persistent memory, fault tolerance, system-wide failure, RMR complexity, dynamic joining, dynamic leaving}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail