Apache ZooKeeper

Developer(s)	Apache Software Foundation
Stable release	3.6.1 / April 30, 2020[1]
Repository	ZooKeeper Repository
Written in	Java
Operating system	Cross-platform
Type	Distributed computing
License	Apache License 2.0
Website	zookeeper.apache.org

Apache ZooKeeper is an open-source coordination service for distributed systems, developed as a project of the Apache Software Foundation. It offers a hierarchical key-value store that is used to provide a distributed configuration service, synchronization service, and naming registry for large distributed systems (see Use cases).[2] ZooKeeper began as a sub-project of Hadoop but is now a top-level Apache project in its own right.

ZooKeeper's architecture supports high availability through redundant services: if one ZooKeeper server fails to answer, a client can ask another. ZooKeeper nodes store their data in a hierarchical namespace, much like a file system or a tree data structure. Clients can read from and write to the nodes and in this way share a configuration service. ZooKeeper can be viewed as an atomic broadcast system, through which updates are totally ordered. The ZooKeeper Atomic Broadcast (ZAB) protocol is the core of the system.[3]
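The hierarchical namespace and the total ordering of updates can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: the class `ZNodeTree` and its methods `create`, `set`, `get` and `children` are hypothetical names chosen for illustration and are not ZooKeeper's client API, and the single counter `_last_txid` merely stands in for the global update order that ZAB establishes across replicated servers.

```python
class ZNodeTree:
    """Toy single-process model of a hierarchical namespace (not ZooKeeper itself)."""

    def __init__(self):
        self._data = {"/": b""}  # path -> payload; the root always exists
        self._last_txid = 0      # stand-in for a global, total order of updates
        self.history = []        # (txid, operation, path) in the order applied

    def _record(self, op, path):
        # Every update gets the next id, so all updates are totally ordered.
        self._last_txid += 1
        self.history.append((self._last_txid, op, path))
        return self._last_txid

    @staticmethod
    def _parent(path):
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path, value=b""):
        if path in self._data:
            raise KeyError(f"node already exists: {path}")
        if self._parent(path) not in self._data:
            raise KeyError(f"parent does not exist: {path}")
        self._data[path] = value
        return self._record("create", path)

    def set(self, path, value):
        if path not in self._data:
            raise KeyError(f"no such node: {path}")
        self._data[path] = value
        return self._record("set", path)

    def get(self, path):
        return self._data[path]

    def children(self, path):
        # Direct children only: one more path component below `path`.
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self._data
            if p != path and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


tree = ZNodeTree()
tree.create("/app")
tree.create("/app/config", b"timeout=30")
tree.set("/app/config", b"timeout=60")
```

As in a file system, a node can only be created beneath an existing parent, and every change is stamped with a strictly increasing identifier, so any reader of `history` sees the updates in the same order.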

ZooKeeper is used by companies including Yelp, Rackspace, Yahoo!,[4] Odnoklassniki, Reddit,[5] NetApp SolidFire,[6] Facebook,[7] Twitter[8] and eBay as well as open source enterprise search systems like Solr.[9]

ZooKeeper is modeled after Google's Chubby lock service[10][11] and was originally developed at Yahoo! to streamline the processes running on big-data clusters by storing their status in local log files on the ZooKeeper servers. These servers communicate with the client machines to provide them with that information. ZooKeeper was developed to address the bugs that arose when deploying distributed big-data applications.

Key features of Apache ZooKeeper include:

Reliable System: The system keeps working even if a node fails.
Simple Architecture: The architecture is built around a shared hierarchical namespace, which helps coordinate processes.
Fast Processing: ZooKeeper is especially fast in "read-dominant" workloads (i.e. workloads in which reads are much more common than writes).
Scalable: The performance of ZooKeeper, in particular its read throughput, can be improved by adding nodes.
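The reliability claim can be made concrete with majority-quorum arithmetic: a replicated ensemble stays available as long as a majority of its servers is running. The following sketch shows how many failures an ensemble of a given size can tolerate under that rule. The function names `quorum_size` and `tolerated_failures` are hypothetical helpers written for this illustration, not part of ZooKeeper.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Servers that may fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under this rule a three-server ensemble survives one failure and a five-server ensemble survives two, while a fourth server adds no extra tolerance over three, which is why odd ensemble sizes are the usual choice.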

Apache ZooKeeper architecture

Some common terms used to describe the ZooKeeper architecture:

Node: A system installed on the cluster
ZNode: A node whose status is updated by other nodes in the cluster
Client applications: The tools that interact with the distributed applications
Server applications: The applications that allow client applications to interact through a common interface

The services in the cluster are replicated and stored on a set of servers (called an "ensemble"), each of which maintains an in-memory database containing the entire data tree of state as well as a transaction log and snapshots stored persistently. Multiple client applications can connect to a server, and each client maintains a TCP connection through which it sends requests and heartbeats and receives responses and watch events for monitoring.[12]
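The combination of an in-memory database with a persistent transaction log and snapshots can be sketched as follows. This is a simplified illustration, not ZooKeeper's storage code: the class `ToyServerState`, its methods, and the JSON encoding are all assumptions made for the example, and Python lists stand in for files on disk.

```python
import json


class ToyServerState:
    """Toy model: in-memory tree plus an append-only log and a snapshot."""

    def __init__(self):
        self.tree = {}        # in-memory data: path -> value
        self.log = []         # append-only transaction log (JSON lines)
        self.snapshot = None  # (last_txid, JSON dump of the tree) or None
        self.last_txid = 0

    def apply(self, path, value):
        # Record the transaction first, then update the in-memory tree.
        self.last_txid += 1
        self.log.append(
            json.dumps({"txid": self.last_txid, "path": path, "value": value})
        )
        self.tree[path] = value

    def take_snapshot(self):
        # Persist the whole tree together with the last transaction it covers.
        self.snapshot = (self.last_txid, json.dumps(self.tree))

    @classmethod
    def recover(cls, snapshot, log):
        # Rebuild memory: load the snapshot, then replay newer log entries.
        state = cls()
        if snapshot is not None:
            state.last_txid, dumped = snapshot
            state.tree = json.loads(dumped)
        state.snapshot = snapshot
        for line in log:
            txn = json.loads(line)
            if txn["txid"] > state.last_txid:
                state.tree[txn["path"]] = txn["value"]
                state.last_txid = txn["txid"]
        state.log = list(log)
        return state


server = ToyServerState()
server.apply("/a", "1")
server.take_snapshot()
server.apply("/b", "2")
server.apply("/a", "3")
restarted = ToyServerState.recover(server.snapshot, server.log)
```

After the simulated restart, `restarted.tree` matches `server.tree`: the snapshot supplies the state up to the first transaction, and replaying the remaining log entries brings it up to date.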

Use cases

Typical use cases for ZooKeeper include distributed configuration management, synchronization, and naming services.

Client libraries

In addition to the client libraries included with the ZooKeeper distribution, a number of third-party libraries such as Apache Curator and Kazoo are available. They make ZooKeeper easier to use, add functionality, or support additional programming languages.

Apache projects using ZooKeeper

etc.

References

[1] "Apache ZooKeeper - Releases". Retrieved 17 May 2020.
[2] "Index - Apache ZooKeeper - Apache Software Foundation". cwiki.apache.org. Retrieved 2016-08-26.
[3] "Zookeeper Overview".
[4] "ZooKeeper/Powered By". Archived from the original on 2013-12-09. Retrieved 2012-01-25.
[5] "Why Reddit was down on Aug 11".
[6] "5 Big DaaS Challenges and How to Overcome Them | NetApp Newsroom". NetApp Newsroom. 2016-06-20. Retrieved 2017-05-24.
[7] "Location-Aware Distribution: Configuring servers at scale". Facebook Code. 2018-07-19. Retrieved 2018-07-20.
[8] "ZooKeeper at Twitter". Twitter Engineering Blog. 2018-10-11. Retrieved 2018-12-08.
[9] "SolrCloud".
[10] Burrows, Mike (2006). "The Chubby lock service for loosely-coupled distributed systems". 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI).
[11] Chandra, Tushar Deepak; Griesemer, Robert; Redstone, Joshua (2007). "Paxos Made Live - An Engineering Perspective (2006 Invited Talk)". Google Research. Retrieved 2020-03-03.
[12] "Zookeeper".

External links

Official website

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
