Consistent Distributed Memory Services: Resilience and Efficiency (Invited Paper)

Authors Theophanis Hadjistasi, Alexander A. Schwarzmann



PDF
Thumbnail PDF

File

LIPIcs.ICALP.2018.1.pdf
  • Filesize: 0.5 MB
  • 19 pages

Document Identifiers

Author Details

Theophanis Hadjistasi
  • University of Connecticut, Storrs CT, USA
Alexander A. Schwarzmann
  • University of Connecticut, Storrs CT, USA

Cite AsGet BibTex

Theophanis Hadjistasi and Alexander A. Schwarzmann. Consistent Distributed Memory Services: Resilience and Efficiency (Invited Paper). In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 1:1-1:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.ICALP.2018.1

Abstract

Reading, 'Riting, and 'Rithmetic, the three R's underlying much of human intellectual activity, not surprisingly, also stand as a venerable foundation of modern computing technology. Indeed, both the Turing machine and von Neumann machine models operate by reading, writing, and computing, and all practical uniprocessor implementations are based on performing activities structured in terms of the three R's. With the advance of networking technology, communication became an additional major systemic activity. However, at a high level of abstraction, it is apparently still more natural to think in terms of reading, writing, and computing. While it is hard to imagine distributed systems - such as those implementing the World-Wide Web - without communication, we often imagine browser-based applications that operate by retrieving (i.e., reading) data, performing computation, and storing (i.e., writing) the results. In this article, we deal with the storage of shared readable and writable data in distributed systems that are subject to perturbations in the underlying distributed platforms composed of computers and networks that interconnect them. The perturbations may include permanent failures (or crashes) of individual computers, transient failures, and delays in the communication medium. The focus of this paper is on the implementations of distributed atomic memory services. Atomicity is a venerable notion of consistency, introduced in 1979 by Lamport [Lamport, 1979]. To this day atomicity remains the most natural type of consistency because it provides an illusion of equivalence with the serial object type that software designers expect. We define the overall setting, models of computation, definition of atomic consistency, and measures of efficiency. We then present algorithms for single-writer settings in the static models. Then we move to presenting algorithms for multi-writer settings. For both static settings we discuss design issues, correctness, efficiency, and trade-offs. Lastly we survey the implementation issues in dynamic settings, where the universe of participants may completely change over time. Here the expectation is that solutions are found by integrating static algorithms with a reconfiguration framework so that during periods of relative stability one benefits from the efficiency of static algorithms, and where during the more turbulent times performance degrades gracefully when reconfigurations are needed. We describe the most important approaches and provide examples.

Subject Classification

ACM Subject Classification
  • Networks → Cloud computing
  • General and reference → Surveys and overviews
Keywords
  • Atomicity
  • shared-memory
  • read/write objects
  • fault-tolerance
  • latency

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Marcos K. Aguilera, Idit Keidar, Dahlia Malkhi, and Alexander Shraer. Dynamic atomic storage without consensus. J. ACM, 58:7:1-7:32, April 2011. URL: http://dx.doi.org/10.1145/1944345.1944348.
  2. Marcos K. Aguilera, Idit Keidary, Dahlia Malkhi, Jean-Philippe Martin, and Alexander Shraery. Reconfiguring replicated atomic storage: A tutorial. Bulletin of the EATCS, 102:84-081, 2010. Google Scholar
  3. Antonio Fernández Anta, Theophanis Hadjistasi, and Nicolas C. Nicolaou. Computationally light "multi-speed" atomic memory. In 20th International Conference on Principles of Distributed Systems, OPODIS 2016, December 13-16, 2016, Madrid, Spain, pages 29:1-29:17, 2016. URL: http://dx.doi.org/10.4230/LIPIcs.OPODIS.2016.29.
  4. Antonio Fernández Anta, Nicolas C. Nicolaou, and Alexandru Popa. Making "fast" atomic operations computationally tractable. In 19th International Conference on Principles of Distributed Systems, OPODIS 2015, December 14-17, 2015, Rennes, France, pages 19:1-19:16, 2015. URL: http://dx.doi.org/10.4230/LIPIcs.OPODIS.2015.19.
  5. H. Attiya, A. Bar-Noy, and D. Dolev. Sharing memory robustly in message passing systems. Journal of the ACM, 42(1):124-142, 1996. Google Scholar
  6. Hagit Attiya, Amotz Bar-Noy, and Danny Dolev. Sharing memory robustly in message-passing systems. J. ACM, 42(1):124-142, 1995. URL: http://dx.doi.org/10.1145/200836.200869.
  7. Hagit Attiya and Jennifer L. Welch. Sequential consistency versus linearizability. ACM Trans. Comput. Syst., 12(2):91-122, 1994. URL: http://dx.doi.org/10.1145/176575.176576.
  8. Ken Birman. A history of the virtual synchrony replication model. In Replication: Theory and Practice, volume 5959 of Lecture Notes in Computer Science, pages 91-120, 2010. Google Scholar
  9. Ken Birman, Dahlia Malkhi, and Robbert Van Renesse. Virtually synchronous methodology for dynamic service replication. Technical report, MSR-TR-2010-151, Microsoft Research, 2010. Google Scholar
  10. Brad Calder, Ju Wang, Aaron Ogus, Niranjan Nilakantan, Arild Skjolsvold, Sam McKelvie, Yikang Xu, Shashwat Srivastav, Jiesheng Wu, Huseyin Simitci, Jaidev Haridas, Chakravarthy Uddaraju, Hemal Khatri, Andrew Edwards, Vaman Bedekar, Shane Mainali, Rafay Abbasi, Arpit Agarwal, Mian Fahim ul Haq, Muhammad Ikram ul Haq, Deepali Bhardwaj, Sowmya Dayanand, Anitha Adusumilli, Marvin McNett, Sriram Sankaran, Kavitha Manivannan, and Leonidas Rigas. Windows azure storage: a highly available cloud storage service with strong consistency. In Ted Wobber and Peter Druschel, editors, Proceedings of the 23rd ACM Symposium on Operating Systems Principles 2011, SOSP 2011, Cascais, Portugal, October 23-26, 2011, pages 143-157. ACM, 2011. URL: http://dx.doi.org/10.1145/2043556.2043571.
  11. Tushar Deepak Chandra, Vassos Hadzilacos, and Sam Toueg. An algorithm for replicated objects with efficient reads. In Proceedings of the 2016 ACM Symposium on Principles of Distributed Computing, PODC 2016, Chicago, IL, USA, July 25-28, 2016, pages 325-334, 2016. URL: http://dx.doi.org/10.1145/2933057.2933111.
  12. Gregory Chockler, Seth Gilbert, Vincent Gramoli, Peter M. Musial, and Alexander A. Shvartsman. Reconfigurable distributed storage for dynamic networks. Journal of Parallel and Distributed Computing, 69(1):100-116, 2009. Google Scholar
  13. James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, J. J. Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson C. Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, and Dale Woodford. Spanner: Google’s globally-distributed database. In 10th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2012, Hollywood, CA, USA, October 8-10, 2012, pages 261-264, 2012. URL: https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett.
  14. Roberto De Prisco, Alan Fekete, Nancy A. Lynch, and Alexander A. Shvartsman. A dynamic primary configuration group communication service. In Proc. of the 13th Int-l Symposium on Distributed Computing, pages 64-78. Springer-Verlag, 1999. URL: http://dl.acm.org/citation.cfm?id=645956.675955.
  15. Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: amazon’s highly available key-value store. SIGOPS Oper. Syst. Rev., 41(6):205-220, 2007. URL: http://dx.doi.org/10.1145/1323293.1294281.
  16. S. Dolev, S. Gilbert, N. Lynch, A. Shvartsman, and J. Welch. Geoquorums: Implementing atomic memory in mobile ad hoc networks. In Proceedings of 17th International Symposium on Distributed Computing (DISC), 2003. Google Scholar
  17. Partha Dutta, Rachid Guerraoui, Ron R. Levy, and Arindam Chakraborty. How fast can a distributed atomic read be? In Proceedings of the 23rd ACM symposium on Principles of Distributed Computing (PODC), pages 236-245, 2004. Google Scholar
  18. Burkhard Englert, Chryssis Georgiou, Peter M. Musial, Nicolas Nicolaou, and Alexander A. Shvartsman. On the efficiency of atomic multi-reader, multi-writer distributed memory. In Proceedings 13th International Conference On Principle Of DIstributed Systems (OPODIS 09), pages 240-254, 2009. Google Scholar
  19. Alan Fekete, Nancy A. Lynch, and Alexander A. Shvartsman. Specifying and using a partitionable group communication service. ACM Trans. Comput. Syst., 19(2):171-216, 2001. URL: http://dx.doi.org/10.1145/377769.377776.
  20. Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. Impossibility of distributed consensus with one faulty process. Journal of ACM, 32(2):374-382, 1985. URL: http://dx.doi.org/10.1145/3149.214121.
  21. Chryssis Georgiou, Theophanis Hadjistasi, Nicolas C. Nicolaou, and Alexander A. Schwarzmann. Unleeshing and speeding up readers in atomic object implementations. In Networked Systems - 6th International Conference, NETYS 2018, Essaouria, Morocco, May 9-11, 2018, Proceedings, 2018. Google Scholar
  22. Chryssis Georgiou, Peter M. Musial, and Alex A. Shvartsman. Long-lived RAMBO: Trading knowledge for communication. Theoretical Computer Science, 383(1):59-85, 2007. Google Scholar
  23. Chryssis Georgiou, Peter M. Musial, and Alexander A. Shvartsman. Developing a consistent domain-oriented distributed object service. IEEE Transactions of Parallel and Distributed Systems (TPDS), 20(11):1567-1585, 2009. A preliminary version of this work appeared in the proceedings of the 4th IEEE International Symposium on Network Computing and Applications (NCA'05). Google Scholar
  24. Chryssis Georgiou, Nicolas Nicolaou, Alexander Russel, and Alexander A. Shvartsman. Towards feasible implementations of low-latency multi-writer atomic registers. In 10th Annual IEEE International Symposium on Network Computing and Applications, 2011. Google Scholar
  25. Chryssis Georgiou, Nicolas C. Nicolaou, and Alexander A. Shvartsman. On the robustness of (semi) fast quorum-based implementations of atomic shared memory. In DISC '08: Proceedings of the 22nd international symposium on Distributed Computing, pages 289-304, Berlin, Heidelberg, 2008. Springer-Verlag. URL: http://dx.doi.org/10.1007/978-3-540-87779-0_20.
  26. Chryssis Georgiou, Nicolas C. Nicolaou, and Alexander A. Shvartsman. Fault-tolerant semifast implementations of atomic read/write registers. Journal of Parallel and Distributed Computing, 69(1):62-79, 2009. URL: http://dx.doi.org/10.1016/j.jpdc.2008.05.004.
  27. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The google file system. In Proceedings of the nineteenth ACM symposium on Operating systems principles, SOSP '03, pages 29-43, New York, NY, USA, 2003. ACM. URL: http://dx.doi.org/10.1145/945445.945450.
  28. S. Gilbert, N. Lynch, and A. Shvartsman. RAMBO: A robust, reconfigurable atomic memory service for dynamic networks. Distributed Computing, 23(4):225-272, December 2010. Google Scholar
  29. Vincent Gramoli, Peter M. Musial, and Alexander A. Shvartsman. Operation liveness and gossip management in a dynamic distributed atomic data service. In Proceedings of the ISCA 18th International Conference on Parallel and Distributed Computing Systems, September 12-14, 2005 Imperial Palace Hotel, Las Vegas, Nevada, US, pages 206-211, 2005. Google Scholar
  30. Theophanis Hadjistasi, Nicolas C. Nicolaou, and Alexander A. Schwarzmann. Oh-ram! one and a half round atomic memory. In Networked Systems - 5th International Conference, NETYS 2017, Marrakech, Morocco, May 17-19, 2017, Proceedings, pages 117-132, 2017. URL: http://dx.doi.org/10.1007/978-3-319-59647-1_10.
  31. Maurice Herlihy. Wait-free synchronization. ACM Trans. Program. Lang. Syst., 13(1):124-149, 1991. URL: http://dx.doi.org/10.1145/114005.102808.
  32. Maurice P. Herlihy and Jeannette M. Wing. Linearizability: a correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems (TOPLAS), 12(3):463-492, 1990. URL: http://dx.doi.org/10.1145/78969.78972.
  33. Kishori M. Konwar, Peter M. Musial, Nicolas C. Nicolaou, and Alexander A. Shvartsman. Implementing atomic data through indirect learning in dynamic networks. In Sixth IEEE International Symposium on Network Computing and Applications (NCA 2007), 12 - 14 July 2007, Cambridge, MA, USA, pages 223-230, 2007. URL: http://dx.doi.org/10.1109/NCA.2007.30.
  34. Avinash Lakshman and Prashant Malik. Cassandra: a decentralized structured storage system. SIGOPS Oper. Syst. Rev., 44(2):35-40, 2010. URL: http://dx.doi.org/10.1145/1773912.1773922.
  35. L. Lamport. How to make a multiprocessor computer that correctly executes multiprocess progranm. IEEE Trans. Comput., 28(9):690-691, 1979. URL: http://dx.doi.org/10.1109/TC.1979.1675439.
  36. Leslie Lamport. The part-time parliament. ACM Transactions on Computer Systems, 16(2):133-169, 1998. URL: http://dx.doi.org/10.1145/279227.279229.
  37. Barbara Liskov. The power of abstraction. In Nancy A. Lynch and Alexander A. Shvartsman, editors, Distributed Computing, 24th Int-l Symposium, DISC 2010 Proc., volume 6343 of LNCS. Springer, 2010. URL: http://dx.doi.org/10.1007/978-3-642-15763-9.
  38. Michael C. Loui and Hosame H. Abu-Amara. Memory requirements for agreement among unreliable asynchronous processes. In Franco P. Preparata, editor, Parallel and Distributed Computing, volume 4 of Advances in Computing Research, pages 163-183. JAI Press, Greenwich, Conn., 1987. Google Scholar
  39. N.A. Lynch. Distributed Algorithms. Morgan Kaufmann Publishers, 1996. Google Scholar
  40. Nancy A. Lynch and Alexander A. Shvartsman. Robust emulation of shared memory using dynamic quorum-acknowledged broadcasts. In Proceedings of Symposium on Fault-Tolerant Computing, pages 272-281, 1997. Google Scholar
  41. Achour Mostéfaoui and Michel Raynal. Time-efficient read/write register in crash-prone asynchronous message-passing systems. In Networked Systems - 4th International Conference, NETYS 2016, Marrakech, Morocco, May 18-20, 2016, Revised Selected Papers, pages 250-265, 2016. URL: http://dx.doi.org/10.1007/978-3-319-46140-3_21.
  42. Peter M. Musial, Nicolas C. Nicolaou, and Alexander A. Shvartsman. Implementing distributed shared memory for dynamic networks. Commun. ACM, 57(6):88-98, 2014. URL: http://dx.doi.org/10.1145/2500874.
  43. René Peeters. The maximum edge biclique problem is np-complete. Discrete Applied Mathematics, 131(3):651-654, 2003. URL: http://dx.doi.org/10.1016/S0166-218X(03)00333-0.
  44. Marko Vukolic. Quorum Systems: With Applications to Storage and Consensus. Synthesis Lectures on Distributed Computing Theory. Morgan & Claypool Publishers, 2012. URL: http://dx.doi.org/10.2200/S00402ED1V01Y201202DCT009.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail