Fast Lean Erasure-Coded Atomic Memory Object

Authors Kishori M. Konwar, N. Prakash, Muriel Médard, Nancy Lynch



File

LIPIcs.OPODIS.2019.12.pdf
  • Filesize: 1.39 MB
  • 17 pages

Document Identifiers
  • DOI: 10.4230/LIPIcs.OPODIS.2019.12

Author Details

Kishori M. Konwar
  • Department of EECS, MIT, Cambridge, USA
N. Prakash
  • Intel Inc, OR, USA
Muriel Médard
  • Department of EECS, MIT, Cambridge, USA
Nancy Lynch
  • Department of EECS, MIT, Cambridge, USA

Cite As

Kishori M. Konwar, N. Prakash, Muriel Médard, and Nancy Lynch. Fast Lean Erasure-Coded Atomic Memory Object. In 23rd International Conference on Principles of Distributed Systems (OPODIS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 153, pp. 12:1-12:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.OPODIS.2019.12

Abstract

In this work, we propose FLECKS, an algorithm that implements atomic memory objects in a multi-writer multi-reader (MWMR) setting over asynchronous networks with server failures. FLECKS substantially reduces storage and communication costs over its replication-based counterparts by employing erasure codes. FLECKS outperforms previously proposed algorithms on the metrics that matter for performance, such as storage cost per object, communication cost per operation, fault tolerance of clients and servers, guaranteed liveness of operations, and the number of communication rounds per operation. We provide proofs of the liveness and atomicity properties of FLECKS and derive worst-case latency bounds for its operations. We implemented and deployed FLECKS on cloud-based clusters and demonstrate that it has substantially lower storage and bandwidth costs, and significantly lower operation latency, than replication-based mechanisms.
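The storage saving comes from erasure coding: rather than keeping a full copy of the object at every server, each server keeps only a coded fragment, so per-server storage drops from |v| to roughly |v|/k for a (n, k) code. The sketch below is a minimal, self-contained Python illustration of that idea, not the FLECKS protocol itself; it uses a toy (k+1, k) code built from k data fragments plus one XOR parity fragment, and it omits the tags and multi-round client-server exchanges FLECKS uses to guarantee atomicity.

    # Toy (k+1, k) MDS-style code: k data fragments + one XOR parity fragment.
    # It tolerates the loss of any single fragment and stores |v|/k bytes per server,
    # versus |v| bytes per server under full replication.

    def encode(value, k):
        """Split `value` into k equal-size data fragments and append an XOR parity fragment."""
        frag_len = -(-len(value) // k)                      # ceiling division
        padded = value.ljust(k * frag_len, b"\x00")
        fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
        parity = fragments[0]
        for frag in fragments[1:]:
            parity = bytes(a ^ b for a, b in zip(parity, frag))
        return fragments + [parity]                         # n = k + 1 fragments in total

    def decode(fragments, k, length):
        """Recover the value from the k+1 fragments, at most one of which may be None."""
        missing = [i for i, f in enumerate(fragments) if f is None]
        assert len(missing) <= 1, "this toy code tolerates only one erasure"
        if missing and missing[0] < k:                      # rebuild a lost data fragment from parity
            rebuilt = fragments[k]
            for i, frag in enumerate(fragments[:k]):
                if i != missing[0]:
                    rebuilt = bytes(a ^ b for a, b in zip(rebuilt, frag))
            fragments[missing[0]] = rebuilt
        return b"".join(fragments[:k])[:length]

    value = b"atomic shared object"
    frags = encode(value, k=4)
    frags[2] = None                                         # one server's fragment is lost
    assert decode(frags, k=4, length=len(value)) == value

In this toy example each of the five servers stores 5 bytes of a 20-byte object instead of all 20, while the object remains recoverable after a single server failure; production systems would use a general (n, k) Reed-Solomon-style code to tolerate more failures.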

Subject Classification

ACM Subject Classification
  • Theory of computation → Distributed computing models
Keywords
  • Atomicity
  • Distributed Storage System
  • Erasure-codes
