How to manage failover in ZooKeeper across datacenters using observers

Question

I have an application running in 3 different datacenters that uses ZooKeeper for many tasks. Following the recommended practices, we have deployed ZooKeeper in all three: one datacenter runs a regular leader/follower ensemble, and the other two run only observers of that ensemble.

DC1: Usual leader/follower ensemble
DC2: Observers of DC1
DC3: Observers of DC1

As explained in ZooKeeper's docs, only the leader machine is able to accept write requests, so followers and observers would route these messages to the leader first. In case the leader becomes unresponsive, an available follower will be elected as the new leader and the ZooKeeper ensemble will remain up.

However, I haven't found any reference on how to handle an entire datacenter going offline. For instance, if our leader/follower ensemble in datacenter 1 is unavailable, how could we make the second datacenter become the primary, with regular leader/follower machines? Would I have to take a node down, change its configuration file to make it a regular node, turn it on again, and then reconfigure all the other ZooKeeper machines to follow this leader? Is there any automatic system for that?
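
To make the manual procedure asked about above concrete, here is a minimal sketch of the configuration edit it implies on ZooKeeper 3.4.x. It assumes the usual observer setup: `peerType=observer` in each observer's own `zoo.cfg`, and an `:observer` suffix on the observers' `server.N` lines. The helper name `rewrite_zoo_cfg`, the hostnames, and the server ids are hypothetical. The sketch only rewrites text. It does not restart any node, and it does not address whether promoting observers is safe for the data.

```python
import re

# Matches lines such as "server.4=dc2-a:2888:3888:observer".
SERVER_LINE = re.compile(r"^server\.(\d+)=(.+)$")

# Hypothetical config of an observer in DC2 (hostnames and ids are made up).
SAMPLE_CFG = """\
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
peerType=observer
server.1=dc1-a:2888:3888
server.2=dc1-b:2888:3888
server.3=dc1-c:2888:3888
server.4=dc2-a:2888:3888:observer
server.5=dc2-b:2888:3888:observer
server.6=dc2-c:2888:3888:observer
server.7=dc3-a:2888:3888:observer
"""


def rewrite_zoo_cfg(cfg_text, promote_ids, drop_ids, promote_self=True):
    """Return a copy of cfg_text with some observers turned into participants.

    promote_ids:  server ids whose ":observer" suffix is removed.
    drop_ids:     server ids removed entirely (the unreachable datacenter).
    promote_self: if True, also drop "peerType=observer", i.e. the file
                  belongs to a node that is itself being promoted.
    """
    out = []
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if promote_self and stripped.replace(" ", "") == "peerType=observer":
            continue  # this node will vote from now on
        match = SERVER_LINE.match(stripped)
        if match:
            sid, spec = int(match.group(1)), match.group(2)
            if sid in drop_ids:
                continue  # forget the servers of the failed datacenter
            if sid in promote_ids and spec.endswith(":observer"):
                spec = spec[: -len(":observer")]
            out.append(f"server.{sid}={spec}")
            continue
        out.append(line)  # unrelated settings pass through unchanged
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Promote the DC2 observers (4, 5, 6) and drop the DC1 servers (1, 2, 3).
    print(rewrite_zoo_cfg(SAMPLE_CFG, {4, 5, 6}, {1, 2, 3}))
```

For a node that stays an observer (server 7 in this made-up layout), the same call with `promote_self=False` keeps its `peerType=observer` line while still updating the server list. On 3.4.x each node would then need a restart to pick up the new file.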

score -1 · Answer 1 · answered Feb 10 '17 at 14:05

Check out ZooKeeper Dynamic Reconfiguration.

treehouse

Thanks for your answer, Kai Wang. Unfortunately, dynamic reconfiguration is only available from ZooKeeper 3.5.0 onward, and that line has no stable release yet. I was looking for a solution that works with 3.4.9, so that I don't expose my service to unresolved bugs in an unstable version. – Matheus Portela Feb 20 '17 at 18:51
This is a good example of why answers should contain more than just a link: links tend to change, or pages get removed. A new link is available here, but it is likely to change if the version changes - https://zookeeper.apache.org/doc/r3.5.3-beta/zookeeperReconfig.html – Titi Jan 11 '18 at 16:39