RMR-Efficient Detectable Objects for Persistent Memory and Their Applications

Authors Sahil Dhoked, Ahmed Fahmy, Wojciech Golab, Neeraj Mittal



Author Details

Sahil Dhoked
  • Department of Computer Science, The University of Texas at Dallas, TX, USA
Ahmed Fahmy
  • Department of Electrical and Computer Engineering, University of Waterloo, Canada
Wojciech Golab
  • Department of Electrical and Computer Engineering, University of Waterloo, Canada
Neeraj Mittal
  • Department of Computer Science, The University of Texas at Dallas, TX, USA

Acknowledgements

We are grateful to the anonymous reviewers for their valuable feedback.

Cite As

Sahil Dhoked, Ahmed Fahmy, Wojciech Golab, and Neeraj Mittal. RMR-Efficient Detectable Objects for Persistent Memory and Their Applications. In 28th International Conference on Principles of Distributed Systems (OPODIS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 324, pp. 5:1-5:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024) https://doi.org/10.4230/LIPIcs.OPODIS.2024.5

Abstract

We describe a novel construction of arbitrary read-modify-write (RMW) primitives in a persistent shared memory model with process failures. Our construction uses blocking synchronization, in the form of recoverable mutual exclusion (RME), and is optimal in terms of the widely studied remote memory reference (RMR) complexity measure. The implemented objects tolerate either system-wide or independent process crashes, depending on the RME lock used, and also provide detectability for resolving the outcome of operations interrupted by failures. We prove that our construction is RMR-optimal using a reduction back to the RME problem. Our proof technique introduces a novel algorithmic style that enables solving challenging synchronization problems using a common execution path for both the system-wide and independent failure models, which previously required separate analyses, and relies only on a suitable implementation of the detectable base objects in each model to achieve RMR efficiency. Experiments demonstrate that our construction outperforms prior wait-free and lock-free algorithms on a multiprocessor with Intel Optane persistent memory.
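To make the notion of a lock-based, detectable RMW object concrete, the following is a minimal Python sketch. It is not the construction from the paper and makes no claim about RMR complexity. It rests on several assumptions: `threading.Lock` stands in for a recoverable mutual exclusion lock, ordinary Python attributes stand in for persistent memory, a crash is simulated by raising the hypothetical `Crash` exception, and the names `DetectableRMW`, `Record`, `apply`, `resolve`, and `fetch_and_increment` are invented for illustration. Detectability is obtained with a simple helping rule: before a process overwrites the state, it first saves the previous writer's response in a per-process result slot.

```python
import threading
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple


class Crash(Exception):
    """Simulated process crash (hypothetical, for illustration only)."""


@dataclass(frozen=True)
class Record:
    value: Any           # current object state
    pid: Optional[int]   # process whose operation produced this state
    seq: int             # that process's operation sequence number
    resp: Any            # response of that operation


class DetectableRMW:
    def __init__(self, initial: Any) -> None:
        # Assumption: a plain lock stands in for a recoverable (RME) lock.
        self._lock = threading.Lock()
        # Assumption: these fields model persistent memory, so they
        # survive a simulated Crash.
        self._state = Record(initial, None, -1, None)
        self._results: Dict[int, Tuple[int, Any]] = {}  # pid -> (seq, resp)

    def apply(self, pid: int, seq: int,
              f: Callable[[Any], Tuple[Any, Any]],
              crash: Optional[str] = None) -> Any:
        """Apply f (value -> (new_value, response)) as operation (pid, seq)."""
        with self._lock:
            rec = self._state
            # Helping: save the last writer's outcome before overwriting it.
            if rec.pid is not None:
                self._results[rec.pid] = (rec.seq, rec.resp)
            done = self._results.get(pid)
            if done is not None and done[0] == seq:
                return done[1]  # already took effect; do not apply twice
            new_value, resp = f(rec.value)
            if crash == "before":
                raise Crash
            # The operation takes effect at this single reference update.
            self._state = Record(new_value, pid, seq, resp)
            if crash == "after":
                raise Crash
            return resp

    def resolve(self, pid: int, seq: int) -> Tuple[bool, Any]:
        """After a crash, report whether operation (pid, seq) took effect."""
        with self._lock:
            rec = self._state
            if rec.pid == pid and rec.seq == seq:
                return True, rec.resp
            done = self._results.get(pid)
            if done is not None and done[0] == seq:
                return True, done[1]
            return False, None

    def read(self) -> Any:
        return self._state.value


def fetch_and_increment(v: int) -> Tuple[int, int]:
    # New value is v + 1; the response is the old value.
    return v + 1, v
```

A process that crashes during `apply(pid, seq, ...)` can call `resolve(pid, seq)` on recovery: a result of `(True, resp)` means the operation took effect with response `resp`, while `(False, None)` means it did not and can safely be retried with the same sequence number.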

Subject Classification

ACM Subject Classification
  • Theory of computation → Concurrent algorithms
Keywords
  • persistent memory
  • synchronization
  • recoverability
  • fault tolerance
  • detectability
  • scalability
  • RMR complexity
  • theory
  • mutual exclusion
