Apache ZooKeeper

Developer	Apache Software Foundation

Stable release	3.9.4-2^[1] / 2025-08-19

Repository	ZooKeeper Repository
Written in	Java
Operating system	Cross-platform
Type	Distributed computing
License	Apache License 2.0
Website	zookeeper.apache.org

Overview

ZooKeeper's architecture supports high availability through redundant services. A client can therefore connect to another ZooKeeper server if the first fails to answer. ZooKeeper nodes store their data in a hierarchical namespace, much like a file system or a tree data structure. Clients can read from and write to the nodes, and in this way have a shared configuration service. ZooKeeper can be viewed as an atomic broadcast system, through which updates are totally ordered. The ZooKeeper Atomic Broadcast (ZAB) protocol is the core of the system.^[4]
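As a rough illustration of the hierarchical namespace and totally ordered updates described above, the following is a minimal single-process sketch in Python. It is a toy model, not ZooKeeper's implementation or client API: the names `ZNode`, `ToyNamespace` and `last_txid` are hypothetical, and the counter merely stands in for the global order that an atomic broadcast protocol would establish across servers.

```python
from dataclasses import dataclass, field


@dataclass
class ZNode:
    """One node in the toy tree: a data payload plus named children."""
    data: bytes = b""
    version: int = 0  # bumped on every successful write to this node
    children: dict = field(default_factory=dict)


class ToyNamespace:
    """In-memory, single-process stand-in for a hierarchical namespace."""

    def __init__(self):
        self.root = ZNode()
        self.last_txid = 0  # hypothetical counter standing in for a global update order
        self.log = []       # (txid, operation, path) in the order applied

    def _walk(self, path):
        """Return the node at an absolute path such as '/app/config'."""
        node = self.root
        for part in [p for p in path.split("/") if p]:
            node = node.children[part]  # KeyError if a component is missing
        return node

    def _record(self, op, path):
        """Assign the next transaction id to an update and log it."""
        self.last_txid += 1
        self.log.append((self.last_txid, op, path))
        return self.last_txid

    def create(self, path, data=b""):
        parent_path, _, name = path.rpartition("/")
        if not name:
            raise ValueError("path must name a node below the root")
        parent = self._walk(parent_path)
        if name in parent.children:
            raise KeyError(f"node already exists: {path}")
        parent.children[name] = ZNode(data)
        return self._record("create", path)

    def set(self, path, data):
        node = self._walk(path)
        node.data = data
        node.version += 1
        return self._record("set", path)

    def get(self, path):
        node = self._walk(path)
        return node.data, node.version


ns = ToyNamespace()
ns.create("/app")
ns.create("/app/config", b"timeout=30")
ns.set("/app/config", b"timeout=60")
data, version = ns.get("/app/config")
```

Every update receives a strictly increasing identifier, so any reader replaying `ns.log` sees the same sequence of changes. In a real deployment that ordering is agreed between servers rather than produced by a local counter.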

ZooKeeper is used by companies including Yelp, Rackspace, Yahoo!,^[5] Odnoklassniki, Reddit,^[6] NetApp SolidFire,^[7] Meta,^[8] Twitter^[9] and eBay, as well as by open-source enterprise search systems like Solr and distributed database systems like Apache Pinot.^[10]^[11]

ZooKeeper is modeled after Google's Chubby lock service^[12]^[13] and was originally developed at Yahoo! for streamlining the processes running on big-data clusters by storing the status in local log files on the ZooKeeper servers. These servers communicate with client machines to deliver the required information. ZooKeeper was developed to address issues that emerged during the deployment of distributed big-data applications.

Some of the key features of Apache ZooKeeper are:

Reliable System: the service keeps working even if some nodes fail.
Simple Architecture: a shared hierarchical namespace helps coordinate the processes.
Fast Processing: especially fast in "read-dominant" workloads (i.e. workloads in which reads are much more common than writes).
Scalable: read throughput can be improved by adding nodes.
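The reliability property in the list above can be made concrete with a little arithmetic. A ZooKeeper ensemble stays available as long as a majority of its servers can communicate, so the number of failures it tolerates follows from the ensemble size. The sketch below works this out in Python; the function names are hypothetical and chosen only for illustration.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

An ensemble of four tolerates no more failures than an ensemble of three, which is why odd ensemble sizes are the usual choice.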

Architecture

Some common terms used when describing the ZooKeeper architecture:

Node: a system installed on the cluster
ZNode: a data node in the ZooKeeper namespace whose status is updated by other nodes in the cluster
Client applications: the tools that interact with the distributed applications
Server applications: allow the client applications to interact using a common interface

The service is replicated across a set of servers (called an "ensemble"), each of which maintains an in-memory database containing the entire data tree, as well as a transaction log and snapshots held in persistent storage. Multiple client applications can connect to a server, and each client maintains a TCP connection through which it sends requests and heartbeats and receives responses and watch events for monitoring.^[14]
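The watch events mentioned above can be sketched with a small in-memory model. The Python example below assumes one-time-trigger semantics, which is how ZooKeeper's standard watches are documented to behave: a watch registered on a read fires once on the next change and must then be registered again. The class `ToyWatchedStore` and its methods are hypothetical and run in a single process, with no network connection, sessions or heartbeats.

```python
from collections import defaultdict


class ToyWatchedStore:
    """Key-value store whose reads can leave a one-shot watch on a path."""

    def __init__(self):
        self.data = {}
        self.watches = defaultdict(list)  # path -> callbacks waiting for a change

    def get(self, path, watch=None):
        """Read a value and optionally register a one-shot watch on the path."""
        if watch is not None:
            self.watches[path].append(watch)
        return self.data.get(path)

    def set(self, path, value):
        """Write a value, then notify and discard every watch on the path."""
        self.data[path] = value
        callbacks = self.watches.pop(path, [])
        for callback in callbacks:
            callback(path)


events = []
store = ToyWatchedStore()
store.set("/status", b"starting")
store.get("/status", watch=events.append)  # register a watch while reading
store.set("/status", b"running")           # the watch fires once
store.set("/status", b"stopped")           # no watch is left, so nothing fires
```

After these calls `events` holds a single entry, because the second write found no registered watch. A client that wants continuous notifications would register a new watch each time it handles an event.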

Client libraries

In addition to the client libraries included with the ZooKeeper distribution, several third-party libraries, including Apache Curator and Kazoo, extend ZooKeeper's capabilities. These libraries offer enhanced ease of use, additional features and support for a broader range of programming languages.
