[druid-user] Stuck Segments In Historicals

Hey, this is a bug I’ve seen for years, and since we have a fair number of customizations I’m curious whether others have seen it as well. Has anyone else noticed a tendency for long-running or very underutilized historicals to “grab onto” segments and refuse to decommission? For example, on a brand new cluster with <100 segments on a historical, setting that historical to decommission moves segments off until it gets down to some small number (1-20 or so in my experience), and then it just stops moving the rest until the Druid historical service is restarted on that host.

The fact that just restarting Druid, with no other changes in the cluster, fixes it makes me think it’s not an issue with the MySQL store or anything else off the box itself, so I’m not sure where to even start looking. If anyone has pointers there, I’d appreciate it.

Start with the coordinator and historical logs, and check the contents of sys.segments and sys.server_segments for the segments that are not getting decommissioned. Perhaps it is running into an error or a timeout somewhere.
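
In case it helps, here is a rough sketch of that sys table check in Python, going through the Druid SQL HTTP endpoint (/druid/v2/sql). It joins sys.server_segments to sys.segments to list what the stuck historical is still serving. The helper names, the base URL and the server name in the usage note are made up for illustration, and which columns turn out to be useful is a guess on my part, so treat it as a starting point rather than a known-good diagnostic.

```python
import json
import urllib.request


def build_stuck_segments_query(historical: str) -> str:
    """Build a Druid SQL query listing segments still served by one historical.

    `historical` is the server name as it appears in sys.server_segments,
    i.e. host:port. This helper is illustrative, not part of Druid.
    """
    # Double any single quotes so the name is a valid SQL string literal.
    server = historical.replace("'", "''")
    return (
        "SELECT s.segment_id, s.datasource, s.num_replicas, "
        "s.is_published, s.is_available, s.is_overshadowed "
        "FROM sys.server_segments ss "
        "JOIN sys.segments s ON ss.segment_id = s.segment_id "
        f"WHERE ss.server = '{server}'"
    )


def run_druid_sql(base_url: str, query: str) -> list:
    """POST a query to the Druid SQL endpoint and return the decoded rows."""
    req = urllib.request.Request(
        f"{base_url}/druid/v2/sql",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Usage would be something like run_druid_sql("http://router.example:8888", build_stuck_segments_query("historical-01.example:8083")), with both values replaced by your own. If the leftover segments share something (same datasource, a single replica, overshadowed), that could narrow down which coordinator log messages to look for.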