Is there a way to gracefully remove an historical node?

If I have 5 historical nodes and want to remove one, is there a way to gracefully reassign all of its segments before terminating the node?

The naive scale-down approach would be to just kill the node, knowing that the coordinator would eventually reassign all of its segments.
But you give up some high availability, since some segments will have only one copy until the reassignment is complete.

I’m not sure if there’s any better scale-down procedure that maintains high availability for all segments.

Hi,

Does segment replication work for you? For example, you can increase the number of replicas from 1 to, let’s say, 2, wait for all segments to be replicated, and then halt the historical. If you halt historicals one by one, you can expect all segments to remain available on the remaining historicals.
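
A minimal sketch of that procedure in Python, under some assumptions: the coordinator is reachable at `http://localhost:8081` (a placeholder), retention rules are set through the coordinator's `/druid/coordinator/v1/rules/{dataSource}` endpoint, and `/druid/coordinator/v1/loadstatus?full` returns a `{tier: {datasource: segments_left}}` map. The helper names are made up for illustration, and the response shape should be checked against the docs for your Druid version.

```python
import json
import urllib.request

COORDINATOR = "http://localhost:8081"  # assumption: your coordinator address


def load_rule(replicas, tier="_default_tier"):
    """Build a loadForever rule that keeps `replicas` copies in `tier`."""
    return {"type": "loadForever", "tieredReplicants": {tier: replicas}}


def set_rules(datasource, rules, base=COORDINATOR):
    """POST the complete rule list for one datasource to the coordinator."""
    req = urllib.request.Request(
        f"{base}/druid/coordinator/v1/rules/{datasource}",
        data=json.dumps(rules).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def fetch_full_load_status(base=COORDINATOR):
    """GET the per-tier count of segments still waiting to be loaded."""
    url = f"{base}/druid/coordinator/v1/loadstatus?full"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def fully_replicated(full_status):
    """True when no tier reports segments left to load (assumed shape)."""
    return all(
        left == 0
        for per_datasource in full_status.values()
        for left in per_datasource.values()
    )
```

Usage would be roughly: call `set_rules("my_datasource", [load_rule(2)])`, poll `fully_replicated(fetch_full_load_status())` until it returns `True`, and only then stop one historical.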

Does this make sense?

Jihoon

Thanks, yes the default replication for all datasources may be 2x, so I can “safely” kill any historical node.
But it puts the system in a compromised state for a while, where not all of the segments are highly-available.
And perhaps queries could be impacted since the broker is caught off-guard by the change.

I wasn’t sure if there was some way to tell the coordinator to mark an historical node for removal, so it would then start reassigning all of its segments to other nodes.
That way the historical node would become completely unused before we terminate it.

Hi Jim,

I think we don’t support it yet, but it sounds useful to me!

Jihoon

Hey, this PR should be of interest: https://github.com/apache/incubator-druid/pull/6349

OMG, 6349 looks like exactly the feature Jim wants.
Thanks for pointing that out!
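
For anyone finding this thread later, here is a rough sketch of what the "mark for removal, then wait until empty" workflow might look like. It assumes a Druid release in which the coordinator dynamic configuration (`/druid/coordinator/v1/config`) accepts a `decommissioningNodes` list of `host:port` strings, and that `/druid/coordinator/v1/servers?simple` reports `host` and `currSize` for each server. The field name, the coordinator address, and the helper names are assumptions, not something confirmed in this thread, so verify them against the docs for your version.

```python
import json
import urllib.request

COORDINATOR = "http://localhost:8081"  # assumption: your coordinator address
CONFIG_URL = f"{COORDINATOR}/druid/coordinator/v1/config"


def with_decommissioning(config, hosts):
    """Return a copy of the dynamic config with `hosts` added to the
    (assumed) decommissioningNodes list, leaving other fields untouched."""
    updated = dict(config)
    current = set(updated.get("decommissioningNodes", []))
    updated["decommissioningNodes"] = sorted(current | set(hosts))
    return updated


def is_drained(servers, host):
    """True when `host` reports zero bytes of loaded segments.

    `servers` is the list returned by /servers?simple (assumed shape).
    """
    for server in servers:
        if server.get("host") == host:
            return server.get("currSize", 0) == 0
    raise ValueError(f"unknown server: {host}")


def mark_for_removal(hosts):
    """Read the current dynamic config, add `hosts`, and post it back."""
    with urllib.request.urlopen(CONFIG_URL) as resp:
        config = json.load(resp)
    req = urllib.request.Request(
        CONFIG_URL,
        data=json.dumps(with_decommissioning(config, hosts)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

The config is read first and posted back whole so that other dynamic settings are not accidentally reset. Once `is_drained` returns `True` for the node, it should be safe to terminate it.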