RADON: Repairable Atomic Data Object in Networks

Authors: Kishori M. Konwar, N. Prakash, Nancy A. Lynch, and Muriel Médard



File

LIPIcs.OPODIS.2016.28.pdf
  • Filesize: 0.59 MB
  • 17 pages

Cite As

Kishori M. Konwar, N. Prakash, Nancy A. Lynch, and Muriel Médard. RADON: Repairable Atomic Data Object in Networks. In 20th International Conference on Principles of Distributed Systems (OPODIS 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 70, pp. 28:1-28:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)
https://doi.org/10.4230/LIPIcs.OPODIS.2016.28

Abstract

Erasure codes offer an efficient way to decrease storage and communication costs when implementing an atomic memory service in asynchronous distributed storage systems. In this paper, we provide erasure-code-based algorithms that additionally perform background repair of crashed nodes. A repair operation of a node in the crashed state is triggered externally, and is carried out by the node itself via message exchanges with other active nodes in the system. Upon completion of repair, the node re-enters the active state and resumes participation in ongoing and future read, write, and repair operations. To guarantee liveness and atomicity simultaneously, existing works assume either the presence of nodes with stable storage, or the presence of nodes that never crash during the execution. We require neither; instead, we consider a natural yet practical network stability condition N1 that only restricts the number of nodes in the crashed/repair state during the broadcast of any message. We present an erasure-code-based algorithm RADON_{C} that is always live, and guarantees atomicity as long as condition N1 holds. In situations where the number of concurrent writes is limited, RADON_{C} has significantly lower storage and communication costs than a replication-based algorithm RADON_{R}, which also works under N1. We further show how a slightly stronger network stability condition N2 can be used to construct algorithms that never violate atomicity. The guarantee of atomicity comes at the expense of an additional phase during read and write operations.
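
To make the storage-cost intuition concrete, the Python sketch below is a toy single-parity erasure code. It is not the MDS code or the read/write/repair protocol of RADON_{C}; all function names here are illustrative. It shows the two ideas the abstract relies on: an object split into k fragments plus a parity block costs far less to store than full replicas, and the fragment held by a crashed node can be rebuilt from the surviving fragments during background repair.

  from functools import reduce

  def xor_bytes(a, b):
      # Byte-wise XOR of two equal-length byte strings.
      return bytes(x ^ y for x, y in zip(a, b))

  def encode(data, k):
      # Split `data` into k equal-size blocks and append one XOR parity block,
      # giving n = k + 1 fragments that tolerate the loss of any one fragment.
      assert len(data) % k == 0, "pad the object so that k divides its length"
      size = len(data) // k
      blocks = [data[i * size:(i + 1) * size] for i in range(k)]
      parity = reduce(xor_bytes, blocks)
      return blocks + [parity]

  def repair(fragments):
      # Rebuild the single missing fragment (marked None) from the survivors:
      # the XOR of all n fragments is zero, so the XOR of the k survivors
      # equals the lost fragment.
      missing = [i for i, f in enumerate(fragments) if f is None]
      assert len(missing) == 1, "this toy code tolerates exactly one crash"
      survivors = [f for f in fragments if f is not None]
      fragments[missing[0]] = reduce(xor_bytes, survivors)
      return fragments

  def decode(fragments):
      # Reassemble the object from the k data blocks (drop the parity block).
      return b"".join(fragments[:-1])

  if __name__ == "__main__":
      obj = b"atomic-register-contents"        # 24-byte object, k = 4
      fragments = encode(obj, k=4)             # 5 fragments of 6 bytes each
      fragments[2] = None                      # node 2 crashes, loses its fragment
      assert decode(repair(fragments)) == obj  # background repair restores it
      # Storage cost: 5 * 6 = 30 bytes here, versus 2 * 24 = 48 bytes for
      # two-way replication that tolerates the same single crash.

A production system would instead use a general [n, k] MDS code such as Reed-Solomon, so that any k of the n fragments suffice to decode; that flexibility is what lets RADON_{C} reduce storage and communication costs relative to RADON_{R} when the number of concurrent writes is limited.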
Keywords
  • Atomicity
  • repair
  • fault-tolerance
  • storage cost
  • erasure codes
