Hi there,
I’m trying to understand how Druid provides fault tolerance. Here is my understanding so far:
- Historical node: if a Historical node goes down, the Druid Coordinator will try to place its segments on another Historical node. Historical nodes in this case are the slaves in a typical master-slave architecture.
- Broker: we can have multiple Broker nodes behind a load balancer. The Broker in this case is a typical client.
- Druid MiddleManager: the MiddleManager can have replication. If the replication factor is 2, the Overlord will create 2 peons in the MiddleManager. This is a typical master-master fan-out write. If one peon goes down, Druid will run with a single peon for that segment granularity period.
- Tranquility: a typical Kafka consumer in my case, so all the properties of a Kafka consumer group apply here.
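To make the Historical case concrete, here is a toy Python sketch of how I picture the reassignment. It is not Druid's code: `reassign_segments`, the node names, and the "least-loaded survivor" placement rule are all assumptions made up for illustration, and the real Coordinator uses its own balancing strategy.

```python
def reassign_segments(assignments, failed_node):
    """Toy model: move segments of a failed Historical node to survivors.

    assignments maps a (hypothetical) node name to the list of segment ids
    it serves. Returns a new mapping that no longer contains failed_node.
    """
    # Copy the surviving nodes so the caller's mapping is left untouched.
    survivors = {node: list(segs) for node, segs in assignments.items()
                 if node != failed_node}
    if not survivors:
        raise RuntimeError("no Historical nodes left to serve segments")

    for segment in assignments.get(failed_node, []):
        # Skip nodes that already hold a replica of this segment.
        candidates = [n for n in survivors if segment not in survivors[n]]
        if not candidates:
            continue  # every survivor already serves this segment
        # Assumed placement rule: least-loaded node, ties broken by name.
        target = min(candidates, key=lambda n: (len(survivors[n]), n))
        survivors[target].append(segment)
    return survivors


cluster = {
    "historical-1": ["seg-a", "seg-b"],
    "historical-2": ["seg-c"],
    "historical-3": [],
}
result = reassign_segments(cluster, "historical-1")
print(result)
```

In this sketch the segments of `historical-1` end up spread over the two remaining nodes, which is the behaviour I expect from the Coordinator, assuming my understanding above is right.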
Things I’m not clear about:
- Druid Coordinator: is it active-passive fault tolerance? If the active Coordinator fails, how does the passive node get control?
- Druid Overlord?
Thank you.