Hey Jithin, responses inline.
I’ve been setting up a Druid cluster to ingest data from Kafka and I was wondering if there’s a recommended way to cluster the different types of Druid nodes.
Currently, my cluster is set up as follows (based on the documentation of the imply-1.3.0 package):
- X nodes that each run Historical and MiddleManager processes
- Y nodes that each run Overlord and Coordinator processes
- Z nodes that each run a Broker process
Questions (related to performance tuning):
(a) Is it recommended to run Historical and MiddleManager processes on the same nodes? Or should they be separated?
Colocating them works well for most use cases. Consider separating them and scaling them independently if you need very fine-grained control over resource allocation.
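One practical concern when colocating is budgeting a node's memory across both processes. As a rough sketch (all numbers here are illustrative assumptions, not Druid defaults), you can reserve heap for the Historical and for each MiddleManager peon, and treat the remainder as direct memory / page cache for the Historical:

```python
def split_node_memory(total_gb, peon_count, peon_heap_gb=2, historical_heap_gb=8):
    """Illustrative memory split for a node running both a Historical and
    a MiddleManager. The per-process sizes are assumptions for the sketch,
    not recommended Druid settings. Returns the GB left over for the
    Historical's direct memory and OS page cache."""
    reserved = historical_heap_gb + peon_count * peon_heap_gb
    if reserved >= total_gb:
        raise ValueError("node too small for this layout")
    return total_gb - reserved

# e.g. a 64 GB node running 4 indexing peons:
print(split_node_memory(64, peon_count=4))  # 48 GB left for direct memory / cache
```

If that leftover number gets uncomfortably small at your peon count, that's a sign the two process types are competing and may be worth separating.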
(b) Same question - corresponding to Overlord and Coordinator processes?
The Overlord and Coordinator have relatively low resource requirements and can definitely run on the same machine. For high availability, consider running a second Overlord/Coordinator pair on another machine; if the primary fails, the secondary will automatically take over.
(c) My understanding is that X is the only number that needs to be increased in order to scale
(i) Kafka real-time ingestion throughput AND
(ii) Query performance
Is that correct? Does Z also need to be increased in order to improve query performance?
X is the most important node type to scale for both query and ingestion performance. If you have a very high query load, you'll eventually need to increase Z as well. Exactly when depends on your queries, but in general X:Z ratios on the order of 10:1 or higher aren't unreasonable. Monitoring Druid's metrics will help here (http://druid.io/docs/0.9.1.1/operations/metrics.html). If you want high availability on the query side, run at least 2 Brokers behind a load balancer.
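The sizing rule above can be sketched in a few lines. This is only an illustration of the ~10:1 X:Z starting ratio and the 2-Broker HA floor mentioned above; the function name and parameters are hypothetical, and real capacity should be tuned from Druid's metrics, not a formula:

```python
import math

def brokers_needed(historical_count, ratio=10, ha=True):
    """Rough Broker (Z) count for a given Historical/MiddleManager (X)
    count, assuming an X:Z ratio of about 10:1 (an assumption taken as a
    starting point, not a rule) and a minimum of 2 Brokers when high
    availability is wanted."""
    floor = 2 if ha else 1
    return max(floor, math.ceil(historical_count / ratio))

print(brokers_needed(30))  # 3
print(brokers_needed(8))   # 2 -- the HA floor dominates at small X
```

In other words, Z stays flat at small cluster sizes and only starts tracking X once the cluster is large or the query load demands it.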
(d) Would it be fine to set Y = 1?
Yes, but as above, for HA you’ll want at least 2.