
I am running an ensemble of 6 ZooKeeper nodes to coordinate 4 Kafka instances. The cluster is split across 2 distinct network zones. Each zone contains:

  • 2 servers with zookeeper + kafka instances

  • 1 server running only a ZooKeeper instance.

So in total the ensemble has 6 ZooKeeper nodes. Based on what has been answered before, I expected in my redundancy tests that ZooKeeper and Kafka would keep running properly even if I killed 3 ZooKeeper processes (each one on a different server). What I have noticed instead is that I can kill at most 2 ZooKeeper processes before the ensemble fails.

All the Kafka and ZooKeeper config files are properly written and list all 6 servers as the whole cluster. What could be causing this? I'd appreciate any clue that could help me understand what is wrong here.

Saulo Ricci
  • ZooKeeper needs a majority of nodes to keep operating: 3/6 alive nodes is obviously not a majority, 4/6 is. The common tip is to use an odd number of nodes, as an even number does not provide extra redundancy and only increases latency, since there is a bit more talking between peers. – om-nom-nom Sep 10 '15 at 19:44
  • @om-nom-nom Does zookeeper always need a majority from the configured server list? Or, is it possible to only need a majority of in-sync servers? – Saulo Ricci Sep 10 '15 at 23:14
  • This is the official doc reference for the even vs. odd number of machines: http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html#sc_zkMulitServerSetup This question has some good info and references: http://stackoverflow.com/questions/13022244/zookeeper-reliability-three-versus-five-nodes – Matthew Daumen Sep 11 '15 at 04:02
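
The majority rule described in the comments can be checked with a little arithmetic. The following is a minimal sketch, not ZooKeeper code: the function names `quorum_size` and `tolerated_failures` are made up for illustration, and it assumes the default majority quorum counted over the configured ensemble.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the configured ensemble (hypothetical helper)."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can be down while a strict majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare odd and even ensemble sizes side by side.
    for n in (3, 4, 5, 6, 7):
        print(f"{n} nodes: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

For 6 nodes this gives a quorum of 4 and 2 tolerated failures, the same tolerance as a 5-node ensemble, which matches the behavior seen in the tests. It also suggests that with 3 nodes in each of the 2 zones, losing a whole zone leaves 3/6 nodes alive, which is not a majority.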
