Multi-node Druid cluster for HA

Hi,

I am trying to set up a fault-tolerant and scalable Druid cluster. I have gone through the Druid and Imply documentation and some earlier forum posts about HA. Following is my understanding and a few queries. Would be great if anyone can pitch in.

I currently have:

  • 3-node ZooKeeper cluster

  • MySQL with replication and failover as the metadata store

  • NFS for deep storage (druid.storage.type=local; see the config sketch after this list)

  • Single-node Druid setup using the docker-druid image
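For reference, the deep-storage half of that setup would look roughly like this in common.runtime.properties (the NFS mount point below is a placeholder, not from this setup):

```
# common.runtime.properties (sketch): "local" deep storage on a shared NFS mount;
# this works across multiple nodes only because every node sees the same mount
druid.storage.type=local
druid.storage.storageDirectory=/mnt/nfs/druid/segments
```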

I am ingesting data into Druid from my application using Tranquility Core.

I have tried taking down the single node which runs the coordinator, overlord, historical, and broker (as specified in supervisor.conf) and verifying that once the node comes back up, data is retained.

From what I have read, for HA we will need:

  • 2 Overlord nodes

  • 2 Coordinator nodes

  • 2+ Broker nodes

  • 2+ Historical nodes

  • 2+ MiddleManager nodes

Also:

  1. The overlord and coordinator work in failover mode, i.e. only one of each is active at a time, so 2 of each should suffice.

  2. Broker nodes can be scaled independently depending on the load. At least 2 are required to ensure HA.

  3. Historical nodes also need to be scaled depending on load.

  4. Broker nodes can be put behind a load balancer (a sketch follows this list).
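For point 4, any HTTP load balancer will do; purely as an illustration (nginx and the host names are my assumptions, and 8082 is the brokers' default port), it might look like:

```
# nginx sketch (hypothetical): round-robin across two Druid brokers
upstream druid_brokers {
    server broker1.example.com:8082;
    server broker2.example.com:8082;
}

server {
    listen 8082;
    location / {
        proxy_pass http://druid_brokers;
    }
}
```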

Questions:

  1. The docker-druid image I am using doesn’t run a MiddleManager node. When would MiddleManager nodes be required?

  2. Does the overlord and coordinator failover configuration need to be specified anywhere? Or is just starting 2 overlords and 2 coordinators pointing to the same ZooKeeper, metadata store, and deep storage sufficient, with ZooKeeper taking care of choosing one as leader and keeping the other as standby?

  3. How can high availability be achieved for the real-time tasks created by Tranquility? Which node type needs to be scaled to achieve this? I.e., if a node running a real-time task goes down before the task's run is over, the data indexed by that node never gets pushed to deep storage. How can this be avoided by replicating the task for HA?

  4. Is any configuration required for historical nodes, or is just starting multiple of them enough, with Druid taking care of the coordination?

Thanks,

Prathamesh

Answers inline.

Also:

  1. The overlord and coordinator work in failover mode, i.e. only one of each is active at a time, so 2 of each should suffice.
  2. Broker nodes can be scaled independently depending on the load. At least 2 are required to ensure HA.
  3. Historical nodes also need to be scaled depending on load.
  4. Broker nodes can be put behind a load balancer.

Correct.

Questions:

  1. The docker-druid image I am using doesn’t run a MiddleManager node. When would MiddleManager nodes be required?

MiddleManagers are required to run ingestion tasks; you will need them if you want to run any ingestion tasks. See http://druid.io/docs/latest/design/middlemanager.html
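A MiddleManager needs very little configuration of its own; a minimal sketch (the capacity value is illustrative, not from this thread):

```
# middleManager/runtime.properties (sketch)
druid.service=druid/middlemanager
# number of ingestion tasks this MiddleManager may run concurrently (one peon process each)
druid.worker.capacity=3
```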

  2. Does the overlord and coordinator failover configuration need to be specified anywhere? Or is just starting 2 overlords and 2 coordinators pointing to the same ZooKeeper, metadata store, and deep storage sufficient, with ZooKeeper taking care of choosing one as leader and keeping the other as standby?

Just starting multiple of them on different machines is enough. No separate configs are needed; leader election happens automatically through ZooKeeper.
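In other words, every coordinator and overlord simply shares the same common configuration, along these lines (host names and credentials are placeholders):

```
# common.runtime.properties (sketch), identical on all coordinator/overlord machines
druid.zk.service.host=zk1.example.com,zk2.example.com,zk3.example.com
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://mysql.example.com:3306/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=<password>
druid.storage.type=local
druid.storage.storageDirectory=/mnt/nfs/druid/segments
```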

  3. How can high availability be achieved for the real-time tasks created by Tranquility? Which node type needs to be scaled to achieve this? I.e., if a node running a real-time task goes down before the task's run is over, the data indexed by that node never gets pushed to deep storage. How can this be avoided by replicating the task for HA?

Tranquility has a config to create replica tasks: the replicants setting in its Beam configs (ClusteredBeamTuning).
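A sketch of how that might look with the Tranquility Core builder API (modeled on the Tranquility README; the ZooKeeper address, datasource name, dimensions, and timestamp column are placeholders, and exact class names and signatures can vary between Tranquility versions):

```java
import com.google.common.collect.ImmutableList;
import com.metamx.common.Granularity;
import com.metamx.tranquility.beam.ClusteredBeamTuning;
import com.metamx.tranquility.druid.DruidBeams;
import com.metamx.tranquility.druid.DruidDimensions;
import com.metamx.tranquility.druid.DruidLocation;
import com.metamx.tranquility.druid.DruidRollup;
import com.metamx.tranquility.tranquilizer.Tranquilizer;
import com.metamx.tranquility.typeclass.Timestamper;
import io.druid.data.input.impl.TimestampSpec;
import io.druid.granularity.QueryGranularities;
import io.druid.query.aggregation.AggregatorFactory;
import io.druid.query.aggregation.CountAggregatorFactory;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.joda.time.DateTime;
import org.joda.time.Period;

import java.util.List;
import java.util.Map;

public class ReplicatedSenderSketch
{
  public static Tranquilizer<Map<String, Object>> build()
  {
    // Extract the event time from each incoming record.
    final Timestamper<Map<String, Object>> timestamper = new Timestamper<Map<String, Object>>()
    {
      @Override
      public DateTime timestamp(Map<String, Object> event)
      {
        return new DateTime(event.get("timestamp"));
      }
    };

    final CuratorFramework curator = CuratorFrameworkFactory.builder()
        .connectString("zk1.example.com:2181") // placeholder ZK address
        .retryPolicy(new ExponentialBackoffRetry(1000, 20, 30000))
        .build();
    curator.start();

    final List<AggregatorFactory> aggregators =
        ImmutableList.<AggregatorFactory>of(new CountAggregatorFactory("rows"));

    return DruidBeams
        .builder(timestamper)
        .curator(curator)
        .discoveryPath("/druid/discovery")
        .location(DruidLocation.create("druid/overlord", "mydatasource"))
        .timestampSpec(new TimestampSpec("timestamp", "auto", null))
        .rollup(DruidRollup.create(
            DruidDimensions.specific(ImmutableList.of("dim1", "dim2")),
            aggregators,
            QueryGranularities.MINUTE))
        .tuning(ClusteredBeamTuning.builder()
            .segmentGranularity(Granularity.HOUR)
            .windowPeriod(new Period("PT10M"))
            .partitions(1)
            .replicants(2) // two replica tasks per partition, so one worker can die
            .build())
        .buildTranquilizer();
  }
}
```

With replicants(2), the overlord runs two copies of each real-time task on different workers; if one goes down before handoff, the other still pushes the segment to deep storage.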

  4. Is any configuration required for historical nodes, or is just starting multiple of them enough, with Druid taking care of the coordination?

By default, Druid tries to load data onto historical nodes with a replication factor of 2.

If you need to change the replication factor, that can be done via the coordinator console by specifying rules for a specific datasource. See http://druid.io/docs/latest/operations/rule-configuration.html
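For illustration, a rule that keeps all data loaded with 3 replicas in the default tier would look like this (format as in the rule-configuration doc above):

```
[
  {
    "type": "loadForever",
    "tieredReplicants": {
      "_default_tier": 3
    }
  }
]
```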

Hi Nishant,

Just one follow up question.

MiddleManagers are required to run ingestion tasks; you will need them if you want to run any ingestion tasks. See http://druid.io/docs/latest/design/middlemanager.html

By ingestion task, do you mean the Kafka indexing task, or does this also apply to the real-time ingestion tasks created by Tranquility? I have seen that there is no MiddleManager used in the docker-druid image, and ingestion using the Tranquility library from my application works fine, so presumably MiddleManagers are not required in that case. Just want to confirm this.

Thanks a lot,

Prathamesh

Found an answer to this. The docker-druid image uses "local" mode for running tasks, which means tasks are executed on the overlord itself and not handed out to MiddleManagers. Thus MiddleManagers are not required when using the default configuration provided by the docker-druid image.
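For reference, this behavior is controlled by the overlord's task-runner setting:

```
# overlord/runtime.properties
# "local" (the default): run tasks inside the overlord process itself
druid.indexer.runner.type=local
# "remote": hand tasks out to MiddleManagers via ZooKeeper
#druid.indexer.runner.type=remote
```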

Thanks,

Prathamesh