Monotonically Relaxing Concurrent Data-Structure Semantics for Increasing Performance: An Efficient 2D Design Framework

Authors: Adones Rukundo, Aras Atalar, Philippas Tsigas



File

LIPIcs.DISC.2019.31.pdf (1.76 MB, 15 pages)
Author Details

Adones Rukundo
  • Chalmers University of Technology, Department of Computer Science and Engineering, Sweden
Aras Atalar
  • Chalmers University of Technology, Department of Computer Science and Engineering, Sweden
Philippas Tsigas
  • Chalmers University of Technology, Department of Computer Science and Engineering, Sweden

Cite As

Adones Rukundo, Aras Atalar, and Philippas Tsigas. Monotonically Relaxing Concurrent Data-Structure Semantics for Increasing Performance: An Efficient 2D Design Framework. In 33rd International Symposium on Distributed Computing (DISC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 146, pp. 31:1-31:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.DISC.2019.31

Abstract

There has been a significant amount of work in the literature proposing semantic relaxation of concurrent data structures for improving scalability and performance. Relaxing the semantics of a data structure unveils a bigger design space that allows weaker synchronization and more useful parallelism. Investigating new data structure designs capable of trading semantics for better performance in a monotonic way is a major challenge in the area. In this paper we address this challenge algorithmically. We present an efficient, lock-free, concurrent data structure design framework for out-of-order semantic relaxation. We introduce a new two-dimensional algorithmic design that uses multiple instances of a given data structure. The first dimension of our design is the number of data structure instances that operations are spread over, in order to benefit from parallelism through disjoint memory accesses; the second dimension is the number of consecutive operations that try to use the same data structure instance, in order to benefit from data locality. Our design can flexibly explore this two-dimensional space to relax concurrent data structure semantics monotonically for better performance within a tight deterministic relaxation bound, as we prove in the paper. We show how our framework can instantiate lock-free out-of-order queues, stacks, counters and deques. We provide implementations of these relaxed data structures and evaluate their performance and behaviour on two parallel architectures. The experimental evaluation shows that our two-dimensional design significantly outperforms the respective previously proposed designs in terms of scalability and performance. Moreover, our design increases performance monotonically as relaxation increases.
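The two dimensions described in the abstract can be loosely illustrated with a relaxed counter: operations are spread over several sub-counter instances (first dimension), and each caller performs up to a fixed number of consecutive operations on one instance before hopping to the next (second dimension). The sketch below is a hypothetical single-file illustration, not the paper's actual algorithm: the names `width` and `depth` are stand-ins for the two dimensions, and it uses per-instance locks for brevity where the paper's framework is lock-free.

```python
import threading

class TwoDimCounter:
    """Illustrative relaxed counter (not the paper's lock-free design).
    'width' sub-counters give disjoint memory access (dimension 1);
    'depth' consecutive increments per sub-counter give locality
    (dimension 2)."""

    def __init__(self, width, depth):
        self.width = width      # number of sub-counter instances
        self.depth = depth      # consecutive operations per instance
        self.subs = [0] * width
        self.locks = [threading.Lock() for _ in range(width)]

    def increment_batch(self, start, n):
        """Perform n increments, spread over the sub-counters in
        batches of at most 'depth' per instance."""
        i = start % self.width
        done = 0
        while done < n:
            with self.locks[i]:
                for _ in range(min(self.depth, n - done)):
                    self.subs[i] += 1
                    done += 1
            i = (i + 1) % self.width  # hop to the next instance

    def read(self):
        # Sum of the sub-counters; under concurrency this is a
        # relaxed (out-of-order) view of the counter's value.
        return sum(self.subs)
```

Intuitively, growing `width` reduces contention at the cost of a weaker (more relaxed) global view, while growing `depth` improves locality by keeping a caller on one instance longer; the paper's contribution is exploring this trade-off with a tight deterministic relaxation bound.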

Subject Classification

ACM Subject Classification
  • Theory of computation → Data structures design and analysis
  • Theory of computation → Design and analysis of algorithms
  • Theory of computation → Concurrency
  • Theory of computation → Concurrent algorithms
  • Computing methodologies → Concurrent algorithms
Keywords
  • Lock-free
  • Concurrency
  • Semantic Relaxation
  • Data Structures

