Hardening Cassandra Against Byzantine Failures

Authors Roy Friedman, Roni Licher



PDF
Thumbnail PDF

File

LIPIcs.OPODIS.2017.27.pdf
  • Filesize: 1.91 MB
  • 20 pages

Document Identifiers

Author Details

Roy Friedman
Roni Licher

Cite AsGet BibTex

Roy Friedman and Roni Licher. Hardening Cassandra Against Byzantine Failures. In 21st International Conference on Principles of Distributed Systems (OPODIS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 95, pp. 27:1-27:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.OPODIS.2017.27

Abstract

Cassandra is one of the most widely used distributed data stores. In this work, we analyze Cassandra’s vulnerabilities when facing Byzantine failures and propose protocols for hardening Cassandra against them. We examine several alternative design choices and compare between them both qualitatively and empirically by using the Yahoo! Cloud Serving Benchmark (YCSB) performance benchmark. Some of our proposals include novel combinations of quorum access protocols with MAC signatures arrays and elliptic curve public key cryptography so that in the normal data path, there are no public key verifications and only a single relatively cheap elliptic curve signature made by the client. Yet, these enable data recovery and authentication despite Byzantine failures and across membership configuration changes. In the experiments, we demonstrate that our best design alternative obtains roughly half the performance of plain (non-Byzantine) Cassandra.
Keywords
  • Cassandra
  • Byzantine Fault Tolerance
  • Distributed Storage

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Riak. URL: http://basho.com/products/riak-kv/.
  2. Marcos K Aguilera and Ram Swaminathan. Remote storage with byzantine servers. In Proc. of ACM SPAA, pages 280-289, 2009. Google Scholar
  3. Yair Amir, Brian Coan, Jonathan Kirsch, and John Lane. Prime: Byzantine replication under attack. IEEE Trans. on Dependable and Secure Computing, 8(4):564-577, 2011. Google Scholar
  4. Leonardo Aniello, Silvia Bonomi, Marta Breno, and Roberto Baldoni. Assessing Data Availability of Cassandra in the Presence of non-accurate Membership. In Proc. of the 2nd ACM International Workshop on Dependability Issues in Cloud Computing, 2013. Google Scholar
  5. Apache. Cassandra. URL: http://cassandra.apache.org/.
  6. Roberto Baldoni, Marco Platania, Leonardo Querzoni, and Sirio Scipioni. A peer-to-peer filter-based algorithm for internal clock synchronization in presence of corrupted processes. In Proc. of the IEEE PRDC, pages 64-72, 2008. Google Scholar
  7. Alysson Bessani, João Sousa, and Eduardo EP Alchieri. State machine replication for the masses with BFT-SMaRt. In Proc. of the Annual IEEE/IFIP DSN, pages 355-362, 2014. Google Scholar
  8. Ken Birman and Roy Friedman. Trading Consistency for Availability in Distributed Systems. Technical Report TR96-1579, Cornell University, 1996. Google Scholar
  9. Edward Bortnikov, Maxim Gurevich, Idit Keidar, Gabriel Kliot, and Alexander Shraer. Brahms: Byzantine resilient random membership sampling. Computer Networks, 53(13):2340-2359, 2009. Google Scholar
  10. Christian Cachin, Dan Dobre, and Marko Vukolić. Separating data and control: Asynchronous BFT storage with 2t+ 1 data replicas. In Stabilization, Safety, and Security of Distributed Systems, pages 1-17. Springer, 2014. Google Scholar
  11. Miguel Castro and Barbara Liskov. Practical Byzantine fault tolerance and proactive recovery. ACM TOCS, 20(4):398-461, 2002. Google Scholar
  12. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C Hsieh, Deborah A Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E Gruber. Bigtable: A distributed storage system for structured data. ACM TOCS, 26(2), 2008. Google Scholar
  13. Allen Clement, Manos Kapritsos, Sangmin Lee, Yang Wang, Lorenzo Alvisi, Mike Dahlin, and Taylor Riche. Upright cluster services. In 22nd ACM SOSP, pages 277-290, 2009. Google Scholar
  14. Brian F Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. Benchmarking cloud serving systems with YCSB. In Proc. of the ACM Symposium on Cloud Computing, pages 143-154, 2010. Google Scholar
  15. Miguel Correia, Daniel Gómez Ferro, Flavio P. Junqueira, and Marco Serafini. Practical Hardening of Crash-tolerant Systems. In Proc. of the USENIX Annual Technical Conference, ATC, pages 41-41, 2012. Google Scholar
  16. Inc DataStax. Apache Cassandra 2.2. URL: http://docs.datastax.com/en/cassandra/2.2/.
  17. Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: Amazon’s highly available key-value store. ACM OSR, 41(6):205-220, 2007. Google Scholar
  18. John R Douceur. The sybil attack. In Peer-to-peer Systems, pages 251-260. Springer, 2002. Google Scholar
  19. Cynthia Dwork, Nancy Lynch, and Larry Stockmeyer. Consensus in the Presence of Partial Synchrony. J. ACM, 35(2):288-323, 1988. Google Scholar
  20. D Eastlake and Tony Hansen. US secure hash algorithms (SHA and HMAC-SHA). Technical report, RFC 4634, July, 2006. Google Scholar
  21. Christof Fetzer and Flaviu Cristian. Integrating external and internal clock synchronization. Real-Time Systems, 12(2):123-171, 1997. Google Scholar
  22. Roy Friedman and Roni Licher. Hardening cassandra against byzantine failures. CoRR, abs/1610.02885, 2016. URL: http://arxiv.org/abs/1610.02885.
  23. Aishwarya Ganesan, Ramnatthan Alagappan, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Redundancy Does Not Imply Fault Tolerance: Analysis of Distributed Storage Reactions to Single Errors and Corruptions. In Proc. of the 15th Usenix Conference on File and Storage Technologies, FAST, pages 149-165, 2017. Google Scholar
  24. Rachid Guerraoui, Nikola Knežević, Vivien Quéma, and Marko Vukolić. The next 700 BFT protocols. In Proc. of the 5th ACM European Conference on Computer Systems, pages 363-376, 2010. Google Scholar
  25. Maurice Peter Herlihy. Replication methods for abstract data types. Technical report, DTIC Document, 1984. Google Scholar
  26. Håvard D Johansen, Robbert Van Renesse, Ymir Vigfusson, and Dag Johansen. Fireflies: A secure and scalable membership and gossip service. ACM TOCS, 33(2), 2015. Google Scholar
  27. Don Johnson, Alfred Menezes, and Scott Vanstone. The elliptic curve digital signature algorithm (ECDSA). International Journal of Information Security, 1(1):36-63, 2001. Google Scholar
  28. Avinash Lakshman and Prashant Malik. Cassandra: a decentralized structured storage system. ACM SIGOPS OSR, 44(2):35-40, 2010. Google Scholar
  29. Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558-565, 1978. Google Scholar
  30. Leslie Lamport, Robert Shostak, and Marshall Pease. The Byzantine generals problem. ACM Trans. on Programming Languages and Systems (TOPLAS), 4(3):382-401, 1982. Google Scholar
  31. Barbara Liskov and Rodrigo Rodrigues. Byzantine clients rendered harmless. In Distributed Computing, pages 487-489. Springer, 2005. Google Scholar
  32. Prince Mahajan, Srinath Setty, Sangmin Lee, Allen Clement, Lorenzo Alvisi, Mike Dahlin, and Michael Walfish. Depot: Cloud storage with minimal trust. ACM TOCS, 29(4), 2011. Google Scholar
  33. Dahlia Malkhi and Michael Reiter. Byzantine quorum systems. Distributed Computing, 11(4):203-213, 1998. Google Scholar
  34. David Mills, Jim Martin, Jack Burbank, and William Kasch. Network time protocol version 4: Protocol and algorithms specification. IETF RFC5905, June, 2010. Google Scholar
  35. F. Pedone, N. Schiper, and J. E. Armendáriz-Iñigo. Byzantine Fault-Tolerant Deferred Update Replication. In Proc. of the 5th Latin-American Symposium on Dependable Computing (LADC), pages 7-16, 2011. Google Scholar
  36. Karin Petersen, Mike Spreitzer, Douglas B. Terry, Marvin Theimer, and Alan J. Demers. Flexible update propagation for weakly consistent replication. In Michel Banâtre, Henry M. Levy, and William M. Waite, editors, Proceedings of the Sixteenth ACM Symposium on Operating System Principles, SOSP 1997, St. Malo, France, October 5-8, 1997, pages 288-301. ACM, 1997. URL: http://dx.doi.org/10.1145/268998.266711.
  37. Ronald L Rivest, Adi Shamir, and Len Adleman. A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2):120-126, 1978. Google Scholar
  38. Atul Singh et al. Eclipse attacks on overlay networks: Threats and defenses. In In IEEE INFOCOM, 2006. Google Scholar
  39. Atul Singh, Pedro Fonseca, Petr Kuznetsov, Rodrigo Rodrigues, Petros Maniatis, et al. Zeno: Eventually consistent byzantine-fault tolerance. In NSDI, pages 169-184, 2009. Google Scholar
  40. Emil Sit and Robert Morris. Security considerations for peer-to-peer distributed hash tables. In Peer-to-Peer Systems, pages 261-269. Springer, 2002. Google Scholar
  41. Robbert Van Renesse, Yaron Minsky, and Mark Hayden. A gossip-style failure detection service. In Middleware, pages 55-70. Springer, 1998. Google Scholar