Improving the Latency and Throughput of ZooKeeper Atomic Broadcast

Authors: Ibrahim EL-Sanosi, Paul Ezhilchelvan



Cite As

Ibrahim EL-Sanosi and Paul Ezhilchelvan. Improving the Latency and Throughput of ZooKeeper Atomic Broadcast. In 2017 Imperial College Computing Student Workshop (ICCSW 2017). Open Access Series in Informatics (OASIcs), Volume 60, pp. 3:1-3:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Abstract

ZooKeeper is a crash-tolerant system that offers fundamental services to Internet-scale applications, thereby easing the development and hosting of the latter. It consists of three or more servers that form a replicated state machine. Keeping these replicas mutually consistent requires executing an atomic broadcast protocol, Zab, so that concurrent requests for state changes are serialised identically at all replicas before being acted upon. ZooKeeper's performance for update operations is therefore determined by Zab's performance. We present two easy-to-implement Zab variants, ZabAC and ZabAA. They are designed to offer small atomic-broadcast latencies and to reduce the processing load on the primary node that plays a leading role in Zab. The former improves ZooKeeper performance and the latter enables ZooKeeper to face more challenging load conditions.

Keywords

  • Atomic Broadcast
  • Server Replication
  • Protocol Latency
  • Throughput
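
The abstract's requirement that concurrent state changes be serialised identically at all replicas can be illustrated with a toy sketch. This is not Zab, ZabAC, or ZabAA: there is no leader election, no acknowledgement quorum, and no crash handling. The `Replica` class, the `sequence` function, and `run_demo` are hypothetical names introduced only for this illustration. The sketch assumes that some component has already assigned each update a single global sequence number, and shows only that replicas which apply updates strictly in that order end in the same state, even when messages arrive in different orders.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica: a key-value store that applies updates in sequence order."""
    state: dict = field(default_factory=dict)
    next_seq: int = 0                            # next sequence number to apply
    pending: dict = field(default_factory=dict)  # updates that arrived early

    def deliver(self, seq, key, value):
        """Buffer the update, then apply every update that is now in order."""
        self.pending[seq] = (key, value)
        while self.next_seq in self.pending:
            k, v = self.pending.pop(self.next_seq)
            self.state[k] = v
            self.next_seq += 1


def sequence(requests):
    """Hypothetical sequencer: gives concurrent requests one global order."""
    return [(seq, key, value) for seq, (key, value) in enumerate(requests)]


def run_demo():
    ordered = sequence([("x", 1), ("y", 2), ("x", 3)])
    a, b = Replica(), Replica()
    for msg in ordered:            # replica a receives messages in order
        a.deliver(*msg)
    for msg in reversed(ordered):  # replica b receives them in reverse
        b.deliver(*msg)
    return a.state, b.state
```

Calling `run_demo()` returns the two replica states, which are equal because both replicas apply the updates in sequence-number order regardless of arrival order. In a real atomic broadcast protocol, agreeing on that order in the presence of crashes is the hard part, and it is the part this sketch deliberately leaves out.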



