Is the coordinator loading the segments across historical nodes correctly in my setup? (Review help)

Hello,

I have two historical nodes, each one with 1.1TB of disk space available, for a total of 2.2TB available for segment cache.

I have three data sources with segments that add up to ~1.3TB of disk space.

It seems that the coordinator is trying to replicate the data sources across both historical nodes instead of distributing the segments so that every segment is loaded on at least one historical node. As a result, only two data sources are fully loaded, while the third is less than 99% loaded.

Have I misconfigured something or is this the default coordinator behavior?

I am using Druid 0.9.1.1.

Thank you in advance for any help.

Hi,
The default Druid load rule is set to a replication factor of 2.

To override this behavior you can add a loadForever rule with a replication factor of 1. (NOTE: with a replication factor of 1, the historicals will not be HA.)

More docs about rules can be found here - http://druid.io/docs/latest/operations/rule-configuration.html
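
For reference, here is a minimal sketch of setting such a rule programmatically through the coordinator's rules API instead of the console. The datasource name and coordinator address are placeholders; the endpoint and rule shape follow the rule-configuration docs linked above:

```python
import requests  # assumes the requests library is installed

COORDINATOR = "http://coordinator-host:8081"  # hypothetical coordinator address
DATASOURCE = "dev_datasource"                 # hypothetical datasource name

# A loadForever rule keeping a single replica in the default tier (no HA).
rules = [
    {
        "type": "loadForever",
        "tieredReplicants": {"_default_tier": 1},
    }
]

# Rules are posted per datasource; they replace that datasource's rule chain.
resp = requests.post(
    f"{COORDINATOR}/druid/coordinator/v1/rules/{DATASOURCE}",
    json=rules,
)
resp.raise_for_status()
```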

Thank you Nishant. This was exactly what I was looking for. One of the data sources is still only used in development, and I don’t need HA for it.

I currently have 2 historical nodes with replication factor 2, and I'm about to add a third node.

Should I first change the replication factor to 3 and then add the node, or the other way around?

Do I need to do anything else besides adding the node and changing the replication factor?

Hi Jakub - your questions indicate that there is some confusion over the purpose of a replication factor (in a Druid cluster or any distributed data framework). You do not need to adjust the replication factor to add members to the historical node cluster.

Data replication serves to provide data redundancy/availability should a historical node fall out of the cluster. With 2 nodes and a RF=2, one node can fail and the other node will attempt to take over responsibilities for the failed node. This assumes that both nodes have each been provisioned resources to handle the entire system workload.

With 3 nodes and a RF=2, one node can fail and the other 2 remaining nodes will attempt to take over the workload previously assigned to the failed node. This assumes that all 3 nodes have been provisioned resources to handle half of the total system workload.

With 3 nodes and a RF=3, two nodes can fail and 1 remaining node will attempt to take over the workload previously assigned to the failed nodes. This assumes that all 3 nodes have each been provisioned resources to handle the entire system workload.
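
To make the storage arithmetic behind these scenarios concrete, here is a back-of-the-envelope sketch. The 1.3 TB figure comes from the original post; perfectly even balancing is an assumption:

```python
def per_node_storage_tb(total_tb: float, replication_factor: int, node_count: int) -> float:
    """Disk each historical must hold, assuming segments balance evenly."""
    return total_tb * replication_factor / node_count

# ~1.3 TB of segments, as in the original post.
print(per_node_storage_tb(1.3, 2, 2))  # 1.30 TB/node -> exceeds the 1.1 TB cache, so loading stalls
print(per_node_storage_tb(1.3, 1, 2))  # 0.65 TB/node with RF=1
print(per_node_storage_tb(1.3, 2, 3))  # ~0.87 TB/node after adding a third node
print(per_node_storage_tb(1.3, 3, 3))  # 1.30 TB/node with RF=3 (everything everywhere)
```

This is also why the third data source in the original post could not finish loading: with RF=2 on two nodes, each node needs room for the full 1.3 TB but only has 1.1 TB of segment cache.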

With a low node count (fewer than a few tens of nodes) you'll probably want to stick with a replication factor of 2.

The HA/failover mechanisms are largely subject to the business rules of the coordinator (http://druid.io/docs/0.10.1/design/coordinator.html).

Well, I wanted to avoid rebalancing in case a node goes down or is stopped, since rebalancing is extremely slow on Druid. And we have plenty of disk space. So my thinking was: why not have all segments replicated to all nodes?

No rebalancing would happen if a node crashed, and load would be more equally distributed. After all, if some query targets an interval that doesn't reside on one node at all, it would perform slower, right?

Hey Jakub,

“we have plenty of disk space […] why not have all segments replicated to all nodes”

This situation is simply not possible on most large production deployments due to total segment size and the cost of high-performance disk. But if it is true for your environment, then sure, you can replicate all segments to all historical nodes.
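
If you go that route, a minimal sketch would be a loadForever rule whose replicant count matches your node count. The tier name and the count of 3 are assumptions for a three-node default tier:

```python
# Hypothetical: replicate everything to all three historicals in the default tier.
replicate_everywhere = [
    {
        "type": "loadForever",
        "tieredReplicants": {"_default_tier": 3},  # one replica per node
    }
]
# POST this list to /druid/coordinator/v1/rules/<dataSource>, as in the earlier sketch.
```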

Cheers

Kyle, do you know why rebalancing is so slow?

In the end I added a third historical node with replication factor 2, and it took 8 hours for 20,000 segments to rebalance, even though there was pretty much no load on the historical, broker, or coordinator nodes.

Hi Jakub,
I have noticed that load speed depends on several factors. You mentioned that there was no load on the historicals, but you should check CPU utilization and network utilization specifically.

CPU utilization can be high even when overall load looks low. In our case, CPU utilization sits at a constant 25% during loading: the system has 4 CPUs and Druid uses only 1 CPU for loading.

If you load segments from S3, the download itself could be another bottleneck.
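
One more thing worth checking: the coordinator deliberately throttles segment movement, so rebalancing can crawl even when no resource is saturated. Here is a hedged sketch of raising those limits through the coordinator's dynamic configuration API; the property names come from the coordinator docs, and the values shown are illustrative, not recommendations:

```python
import requests  # assumes the requests library is installed

COORDINATOR = "http://coordinator-host:8081"  # hypothetical coordinator address

# Fetch the current dynamic config first so unrelated settings are preserved.
config = requests.get(f"{COORDINATOR}/druid/coordinator/v1/config").json()

# Allow more balancer moves and more replica creations per coordinator run.
config["maxSegmentsToMove"] = 50
config["replicationThrottleLimit"] = 50

resp = requests.post(f"{COORDINATOR}/druid/coordinator/v1/config", json=config)
resp.raise_for_status()
```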

David

On Monday, 30 October 2017 at 16:58:24 UTC+1, Jakub Liska wrote:

Hi,

Where can I change that replication factor when submitting an ingestion spec via the command line?

Hi,

The easy way is to change the replication factor from the coordinator console (port 8090) or the Druid console (port 8888): edit the load rule and modify the replication factor.
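
Note that the replication factor is not part of the ingestion spec itself; it lives in the datasource's load rules. If you want a pure command-line route, something along the lines of the rules-API sketch from earlier in the thread should work (the datasource name, coordinator address, and factor of 2 are placeholders):

```python
import requests  # assumes the requests library is installed

COORDINATOR = "http://coordinator-host:8081"  # hypothetical coordinator address
DATASOURCE = "my_datasource"                  # hypothetical datasource name

# loadForever rule with the desired replication factor in the default tier.
rules = [{"type": "loadForever", "tieredReplicants": {"_default_tier": 2}}]

resp = requests.post(
    f"{COORDINATOR}/druid/coordinator/v1/rules/{DATASOURCE}",
    json=rules,
)
resp.raise_for_status()
```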