Improving the Latency and Throughput of ZooKeeper Atomic Broadcast

EL-Sanosi, Ibrahim; Ezhilchelvan, Paul

doi:10.4230/OASIcs.ICCSW.2017.3

File

OASIcs.ICCSW.2017.3.pdf

Filesize: 465 kB
10 pages

Document Identifiers

DOI: 10.4230/OASIcs.ICCSW.2017.3
URN: urn:nbn:de:0030-drops-84452

Author Details

Ibrahim EL-Sanosi

Paul Ezhilchelvan

Cite As

Ibrahim EL-Sanosi and Paul Ezhilchelvan. Improving the Latency and Throughput of ZooKeeper Atomic Broadcast. In 2017 Imperial College Computing Student Workshop (ICCSW 2017). Open Access Series in Informatics (OASIcs), Volume 60, pp. 3:1-3:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/OASIcs.ICCSW.2017.3

Abstract

ZooKeeper is a crash-tolerant system that offers fundamental services to Internet-scale applications, thereby reducing the development and hosting of the latter. It consists of three or more servers that form a replicated state machine. Maintaining these replicas in a mutually consistent state requires executing an Atomic Broadcast Protocol, Zab, so that concurrent requests for state changes are serialised identically at all replicas before being acted upon. Thus, ZooKeeper performance for update operations is determined by Zab performance. We contribute by presenting two easy-to-implement Zab variants, called ZabAC and ZabAA. They are designed to offer small atomic-broadcast latencies and to reduce the processing load on the primary node that plays a leading role in Zab. The former improves ZooKeeper performance and the latter enables ZooKeeper to face more challenging load conditions.
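
The serialisation idea in the abstract can be illustrated with a minimal, in-memory sketch of primary-based atomic broadcast: the primary assigns each request a sequence number, waits for acknowledgements from a majority, and only then commits, so every replica applies requests in the same order. This is not the authors' ZabAC or ZabAA, nor the full Zab protocol; it omits leader election, recovery, and networking. The names `Replica`, `Primary`, `on_propose`, `on_commit`, and `broadcast` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical follower: logs proposals, applies them once committed."""
    name: str
    log: dict = field(default_factory=dict)      # seq -> pending request
    applied: list = field(default_factory=list)  # requests in delivery order

    def on_propose(self, seq, request):
        self.log[seq] = request
        return True  # acknowledgement; a crashed replica would not reply

    def on_commit(self, seq):
        self.applied.append(self.log.pop(seq))


class Primary:
    """Hypothetical primary: orders requests and commits on a majority."""

    def __init__(self, followers):
        self.followers = followers
        self.next_seq = 0
        self.applied = []

    def broadcast(self, request):
        seq = self.next_seq
        self.next_seq += 1
        # The primary counts itself as one acknowledgement.
        acks = 1 + sum(1 for f in self.followers if f.on_propose(seq, request))
        quorum = (len(self.followers) + 1) // 2 + 1
        if acks < quorum:
            raise RuntimeError("no quorum of acknowledgements")
        # Commit: the primary and every follower apply in sequence order.
        self.applied.append(request)
        for f in self.followers:
            f.on_commit(seq)
        return seq


# Three servers in total: one primary and two followers.
followers = [Replica("f1"), Replica("f2")]
primary = Primary(followers)
for req in ["create /a", "set /a 1", "delete /a"]:
    primary.broadcast(req)
```

In this sketch the primary handles every proposal, acknowledgement, and commit, which is the kind of processing load on the primary node that the abstract says the two variants aim to reduce.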

Keywords

Atomic Broadcast
Server Replication
Protocol Latency
Throughput
