Efficient Multi-Word Compare and Swap

Authors Rachid Guerraoui, Alex Kogan, Virendra J. Marathe, Igor Zablotchi



PDF
Thumbnail PDF

File

LIPIcs.DISC.2020.4.pdf
  • Filesize: 0.6 MB
  • 19 pages

Document Identifiers

Author Details

Rachid Guerraoui
  • EPFL, Lausanne, Switzerland
Alex Kogan
  • Oracle Labs, Burlington, MA, USA
Virendra J. Marathe
  • Oracle Labs, Burlinton, MA, USA
Igor Zablotchi
  • EPFL, Lausanne, Switzerland

Cite AsGet BibTex

Rachid Guerraoui, Alex Kogan, Virendra J. Marathe, and Igor Zablotchi. Efficient Multi-Word Compare and Swap. In 34th International Symposium on Distributed Computing (DISC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 179, pp. 4:1-4:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.DISC.2020.4

Abstract

Atomic lock-free multi-word compare-and-swap (MCAS) is a powerful tool for designing concurrent algorithms. Yet, its widespread usage has been limited because lock-free implementations of MCAS make heavy use of expensive compare-and-swap (CAS) instructions. Existing MCAS implementations indeed use at least 2k+1 CASes per k-CAS. This leads to the natural desire to minimize the number of CASes required to implement MCAS. We first prove in this paper that it is impossible to "pack" the information required to perform a k-word CAS (k-CAS) in less than k locations to be CASed. Then we present the first algorithm that requires k+1 CASes per call to k-CAS in the common uncontended case. We implement our algorithm and show that it outperforms a state-of-the-art baseline in a variety of benchmarks in most considered workloads. We also present a durably linearizable (persistent memory friendly) version of our MCAS algorithm using only 2 persistence fences per call, while still only requiring k+1 CASes per k-CAS.

Subject Classification

ACM Subject Classification
  • Theory of computation → Concurrent algorithms
Keywords
  • lock-free
  • multi-word compare-and-swap
  • persistent memory

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Yehuda Afek, Dave Dice, and Adam Morrison. Cache index-aware memory allocation. In Proceedings of the International Symposium on Memory Management (ISMM), page 55–64. Association for Computing Machinery, 2011. Google Scholar
  2. Marcos K Aguilera and Svend Frølund. Strict linearizability and the power of aborting. Technical Report HPL-2003-241, HP Labs, 2003. Google Scholar
  3. Dan Alistarh, William M. Leiserson, Alexander Matveev, and Nir Shavit. Forkscan: Conservative Memory Reclamation for Modern Operating Systems. In Proceedings of the Twelfth European Conference on Computer Systems, EuroSys 2017, pages 483-498, 2017. Google Scholar
  4. James H. Anderson and Mark Moir. Universal Constructions for Multi-object Operations. In 14th Annual ACM Symposium on Principles of Distributed Computing, pages 184-193, 1995. Google Scholar
  5. James H. Anderson, Srikanth Ramamurthy, and Rohit Jain. Implementing Wait-free Objects on Priority-based Systems. In 16th Annual ACM Symposium on Principles of Distributed Computing, pages 229-238, 1997. Google Scholar
  6. Joy Arulraj, Justin Levandoski, Umar Farooq Minhas, and Per-Ake Larson. BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory. In 44th International Conference on Very Large Data Bases, 2018. Google Scholar
  7. Hagit Attiya and Eshcar Hillel. Highly concurrent multi-word synchronization. Theor. Comput. Sci., 412(12-14):1243-1262, 2011. Google Scholar
  8. Ryan Berryhill, Wojciech M. Golab, and Mahesh Tripunitara. Robust shared objects for non-volatile main memory. In 19th International Conference on Principles of Distributed Systems, OPODIS 2015, pages 20:1-20:17, 2015. Google Scholar
  9. Anastasia Braginsky and Erez Petrank. A Lock-free B+Tree. In Proceedings of the Twenty-fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures, pages 58-67, 2012. Google Scholar
  10. Trevor Alexander Brown. Reclaiming Memory for Lock-Free Data Structures: There Has to Be a Better Way. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, pages 261-270, 2015. Google Scholar
  11. Bztree: a high-performance latch-free range index for non-volatile memory. https://github.com/wangtzh/bztree, 2019.
  12. Diego Cepeda, Sakib Chowdhury, Nan Li, Raphael Lopez, Xinzhe Wang, and Wojciech Golab. Toward linearizability testing for multi-word persistent synchronization primitives. In 23rd International Conference on Principles of Distributed Systems, OPODIS 2019, volume 153, pages 19:1-19:17, 2019. Google Scholar
  13. Nachshon Cohen, Rachid Guerraoui, and Igor Zablotchi. The Inherent Cost of Remembering Consistently. In Proceedings of the 30th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2018, 2018. Google Scholar
  14. Jeremy Condit, Edmund B. Nightingale, Christopher Frost, Engin Ipek, Benjamin C. Lee, Doug Burger, and Derrick Coetzee. Better I/O through byte-addressable, persistent memory. In Proceedings of the 22nd ACM Symposium on Operating Systems Principles 2009, SOSP 2009, pages 133-146, 2009. Google Scholar
  15. Tudor David, Aleksandar Dragojevic, Rachid Guerraoui, and Igor Zablotchi. Log-Free Concurrent Data Structures. In 2018 USENIX Annual Technical Conference, USENIX ATC 2018, 2018. Google Scholar
  16. Tudor David, Rachid Guerraoui, and Vasileios Trigonakis. Everything you always wanted to know about synchronization but were afraid to ask. In ACM SIGOPS 24th Symposium on Operating Systems Principles, SOSP '13, Farmington, PA, USA, November 3-6, 2013, pages 33-48, 2013. Google Scholar
  17. David Detlefs, Christine H. Flood, Alex Garthwaite, Paul Martin, Nir Shavit, and Guy L. Steele, Jr. Even Better DCAS-Based Concurrent Deques. In 14th International Conference on Distributed Computing, pages 59-73, 2000. Google Scholar
  18. Dave Dice, Maurice Herlihy, and Alex Kogan. Fast non-intrusive memory reclamation for highly-concurrent data structures. In Proceedings of the 2016 ACM SIGPLAN International Symposium on Memory Management, pages 36-45, 2016. Google Scholar
  19. Simon Doherty, David L. Detlefs, Lindsay Groves, Christine H. Flood, Victor Luchangco, Paul A. Martin, Mark Moir, Nir Shavit, and Guy L. Steele, Jr. DCAS is Not a Silver Bullet for Nonblocking Algorithm Design. In Proceedings of the Sixteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures, pages 216-224, 2004. Google Scholar
  20. Steven D. Feldman, Pierre LaBorde, and Damian Dechev. A Wait-Free Multi-Word Compare-and-Swap Operation. International Journal of Parallel Programming, 43(4):572-596, 2015. Google Scholar
  21. Keir Fraser. Practical lock-freedom. PhD thesis, University of Cambridge, UK, 2004. Google Scholar
  22. Michal Friedman, Maurice Herlihy, Virendra J. Marathe, and Erez Petrank. A persistent lock-free queue for non-volatile memory. In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2018, pages 28-40, 2018. Google Scholar
  23. Michael Greenwald. Non-blocking synchronization and system design. ph.d. thesis, stanford university, 1999. Google Scholar
  24. Michael Greenwald. Two-handed emulation: how to build non-blocking implementation of complex data-structures using DCAS. In 21st Annual ACM Symposium on Principles of Distributed Computing, pages 260-269, 2002. Google Scholar
  25. Rachid Guerraoui, Alex Kogan, Virendra J. Marathe, and Igor Zablotchi. Efficient multi-word compare and swap. ArXiv preprint arXiv:2008.02527, 2020. URL: https://arxiv.org/abs/2008.02527.
  26. Rachid Guerraoui and Ron R. Levy. Robust Emulations of Shared Memory in a Crash-Recovery Model. In 24th International Conference on Distributed Computing Systems (ICDCS 2004), pages 400-407, 2004. Google Scholar
  27. Phuong Hoai Ha and Philippas Tsigas. Reactive Multi-Word Synchronization for Multiprocessors. In 12th International Conference on Parallel Architectures and Compilation Techniques (PACT 2003), pages 184-193, 2003. Google Scholar
  28. Phuong Hoai Ha and Philippas Tsigas. Reactive Multi-word Synchronization for Multiprocessors. J. Instruction-Level Parallelism, 6, 2004. Google Scholar
  29. Tim Harris, James R. Larus, and Ravi Rajwar. Transactional Memory: 2nd Edition. Morgan & Claypool, 2010. Google Scholar
  30. Timothy L. Harris and Keir Fraser. Language support for lightweight transactions. In Proceedings of the 2003 ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages and Applications, pages 388-402, 2003. Google Scholar
  31. Timothy L. Harris, Keir Fraser, and Ian A. Pratt. A Practical Multi-word Compare-and-Swap Operation. In 16th International Conference on Distributed Computing, pages 265-279, 2002. Google Scholar
  32. Maurice Herlihy. A methodology for implementing highly concurrent data objects. ACM Transactions on Programming Languages and Systems, 15(5):745-770, November 1993. Google Scholar
  33. Maurice Herlihy, Victor Luchangco, Mark Moir, and William N. Scherer, III. Software Transactional Memory for Dynamic-sized Data Structures. In Proceedings of the Twenty-second Annual Symposium on Principles of Distributed Computing, pages 92-101, 2003. Google Scholar
  34. Maurice Herlihy and J. Eliot B. Moss. Transactional Memory: Architectural Support for Lock-free Data Structures. In 20th Annual International Symposium on Computer Architecture, pages 289-300, 1993. Google Scholar
  35. Maurice Herlihy and Jeannette M Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, 12(3):463-492, 1990. Google Scholar
  36. Intel. Intelsuperscriptregistered 64 and IA-32 Architectures Software Developer’s Manual Combined. https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf, 2018.
  37. Amos Israeli and Lihu Rappoport. Disjoint-Access-Parallel Implementations of Strong Shared Memory Primitives. In Proceedings of the Thirteenth Annual ACM Symposium on Principles of Distributed Computing, pages 151-160, 1994. Google Scholar
  38. Joseph Izraelevitz, Hammurabi Mendes, and Michael L. Scott. Linearizability of persistent memory objects under a full-system-crash failure model. In Distributed Computing - 30th International Symposium, DISC 2016, pages 313-327, 2016. Google Scholar
  39. Justin J. Levandoski, David B. Lomet, and Sudipta Sengupta. The Bw-Tree: A B-tree for New Hardware Platforms. In Proceedings of the 2013 IEEE International Conference on Data Engineering (ICDE 2013), pages 302-313, 2013. Google Scholar
  40. Virendra J. Marathe and Mark Moir. Toward high performance nonblocking software transactional memory. In Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2008, Salt Lake City, UT, USA, February 20-23, 2008, pages 227-236. ACM, 2008. Google Scholar
  41. Paul E. McKenney. Is parallel programming hard, and, if so, what can you do about it?, 2017. Google Scholar
  42. Paul E. McKenney and John D. Slingwine. Read-Copy Update: Using Execution History to Solve Concurrency Problems. In Parallel and Distributed Computing and Systems, 1998. Google Scholar
  43. Mark Moir. Transparent Support for Wait-Free Transactions. In 11th International Workshop on Distributed Algorithms, pages 305-319, 1997. Google Scholar
  44. Aravind Natarajan and Neeraj Mittal. Fast Concurrent Lock-free Binary Search Trees. In Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 317-328, 2014. Google Scholar
  45. The OpenMP API specification for parallel programming. https://www.openmp.org/, 2019.
  46. Matej Pavlovic, Alex Kogan, Virendra J. Marathe, and Tim Harris. Persistent Multi-Word Compare-and-Swap. In ACM Synposium on Principles of Distributed Computing, 2018. Google Scholar
  47. Steven Pelley, Peter M. Chen, and Thomas F. Wenisch. Memory persistency: Semantics for byte-addressable nonvolatile memory technologies. IEEE Micro, 35(3):125-131, 2015. Google Scholar
  48. Benchmarking framework for index structures on persistent memory. https://github.com/wangtzh/pibench, 2019.
  49. Persistent multi-word compare-and-swap (PMwCAS) for NVRAM. https://github.com/microsoft/pmwcas, 2019.
  50. Manuel Pöter and Jesper Larsson Träff. Stamp-it, amortized constant-time memory reclamation in comparison to five other schemes. In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2018, pages 413-414, 2018. Google Scholar
  51. Nir Shavit and Dan Touitou. Software Transactional Memory. In 14th Annual ACM Symposium on Principles of Distributed Computing, pages 204-213, 1995. Google Scholar
  52. Håkan Sundell. Wait-Free Multi-Word Compare-and-Swap Using Greedy Helping and Grabbing. International Journal of Parallel Programming, 39(6):694-716, 2011. Google Scholar
  53. Tianzheng Wang, Justin Levandoski, and Per-Ake Larson. Easy Lock-Free Indexing in Non-Volatile Memory. In 34th IEEE International Conference on Data Engineering, 2018. Google Scholar
  54. Haosen Wen, Joseph Izraelevitz, Wentao Cai, H. Alan Beadle, and Michael L. Scott. Interval-based memory reclamation. In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 1-13, 2018. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail