Druid Coordinator delay to re-assign segments

from - https://druid.apache.org/docs/latest/design/coordinator.html

Segment Availability

If a Historical process restarts or becomes unavailable for any reason, the Druid Coordinator will notice a process has gone missing and treat all segments served by that process as being dropped. Given a sufficient period of time, the segments may be reassigned to other Historical processes in the cluster. However, each segment that is dropped is not immediately forgotten. Instead, there is a transitional data structure that stores all dropped segments with an associated lifetime. The lifetime represents a period of time in which the Coordinator will not reassign a dropped segment. Hence, if a historical process becomes unavailable and available again within a short period of time, the historical process will start up and serve segments from its cache without any of those segments being reassigned across the cluster.

Question 1: Is there a configuration setting that controls how long the Coordinator will wait before re-assigning segments?

Question 2: If we are rolling-restarting the Historical nodes in our cluster, do the segments that go from 2/2 replicas to 1/2 replicas during the restart get re-assigned based on the same logic described above? We don't want segments to be replicated elsewhere in the cluster when we are just restarting the nodes to roll out configuration changes.

Hi Lucas,

  1. Yes. Can you try extending the time periods on these two parameters, druid.coordinator.period and druid.manager.segments.pollDuration, and see if they help? https://druid.apache.org/docs/latest/configuration/index.html#coordinator

  2. Yes, the same logic applies, so if needed you can extend the checking period and revert the change once the rolling upgrade is done.
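For reference, the two parameters from point 1 go in the Coordinator's runtime.properties. The values below are only illustrative, and I believe the stated defaults are correct for recent versions, but check the linked configuration reference for your version:

```properties
# How often the Coordinator runs its segment-management logic
# (assignment, balancing, etc.). Default is PT60S, I believe.
druid.coordinator.period=PT60S

# How often the Coordinator polls the metadata store for the
# current set of used segments. Default is PT1M, I believe.
druid.manager.segments.pollDuration=PT1M
```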




druid.coordinator.period seems to be what you are looking for.

During a rolling restart, startup is generally fast enough that the segments won't be re-assigned.

But in case you are decommissioning a node, you can review the decommissioningNodes setting in the coordinator dynamic configuration.
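As a sketch of what that looks like: the coordinator dynamic configuration is updated by POSTing a JSON body to /druid/coordinator/v1/config on the Coordinator. The host below is a placeholder, and to my understanding the endpoint expects the full dynamic configuration object, not just the changed field:

```json
{
  "decommissioningNodes": ["historical-host.example.com:8083"]
}
```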

Thanks & Regards


So this solution seems to have a few holes in it.

If I extend those two attributes to, say, 10 minutes, that means I won't load newly indexed segments in an optimal manner, because they will sit and wait for the next coordination run. Meanwhile, if a coordination round starts while a node has only just begun re-announcing its segments, the Coordinator will wrongly assign all of that node's segments to be loaded elsewhere, even though they will be announced shortly. That creates an unnecessary backup in the segment loading queue.
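To make that race concrete, here is a minimal Python sketch (my own illustration, not Druid code) that models the Coordinator as running at fixed multiples of its period and flags any run that falls inside a node's restart window; those are the runs that would see the node's segments as dropped and queue them elsewhere:

```python
import math

def runs_treating_node_as_down(restart_at, reannounce_at, period):
    """Return the coordinator run times (multiples of `period`, in seconds)
    that fall inside [restart_at, reannounce_at): runs that see the node as
    missing and may queue its segments for loading elsewhere."""
    t = math.ceil(restart_at / period) * period
    runs = []
    while t < reannounce_at:
        runs.append(t)
        t += period
    return runs

# A node that is down from t=10s to t=130s:
print(runs_treating_node_as_down(10, 130, 60))   # [60, 120] -> two runs mis-assign
print(runs_treating_node_as_down(10, 130, 600))  # [] -> a 10-minute period skips the window
```

The second call shows why extending the period avoids reassignment during a quick restart, but the cost is exactly the complaint above: every newly indexed segment can also wait up to one full period before being assigned.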