[0.9.0] Multiple Overlord leaders in ZK causing conflicting behavior and index task failures

I’m seeing multiple instances where two overlords both think they are the leader and try to update tasks at the same time, causing errors. I’ve read https://github.com/druid-io/druid/issues/3046, which seems to be related. I see this more often in 0.9.0, and I don’t recall seeing it in 0.8.3.
Is this a bug specific to 0.9.0, or did it exist in previous versions and just not show up for some reason?

I also see the fix in https://github.com/druid-io/druid/pull/3050. Will this fix the issue? Which version of Druid is it merged in?


https://github.com/druid-io/druid/commit/474286bbce500cf7195919396128a752acc7704a is in 0.9.2, which has not started its release cycle yet.

Between https://github.com/druid-io/druid/pull/3050 and https://github.com/druid-io/druid/issues/3237 I think an 0.9.2 sooner rather than later might be a good idea.

also https://github.com/druid-io/druid/pull/3228

Thanks Charles for the info. So until 0.9.2, we might still see similar problems over time? Any suggestions to reduce the chance of this? My guess would be a rolling restart or something else.

Also, https://github.com/druid-io/druid/pull/3228 seems to be related to compression in indexing. Does it cause the double-overlord problem?

No, but it is another improvement which makes a lot of sense to go in.
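
For anyone who wants to at least detect the double-leader condition described above while waiting for 0.9.2, here is a minimal sketch of a check that asks each overlord who it thinks the leader is and flags the case where more than one names itself. It is only a sketch under assumptions: the /druid/indexer/v1/leader path, the host:port response format, and the helper names are not taken from this thread, so verify them against your Druid version before relying on it.

```python
import json
from urllib.request import urlopen


def fetch_reported_leader(overlord, timeout=5.0):
    """Ask one overlord (host:port) which node it believes is the leader.

    Assumption: the overlord answers on /druid/indexer/v1/leader with the
    leader's host:port. Check this against your Druid version.
    """
    url = "http://%s/druid/indexer/v1/leader" % overlord
    with urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8").strip()
    # The body may be a JSON-quoted string or plain text; accept both.
    try:
        return json.loads(body)
    except ValueError:
        return body


def self_declared_leaders(reported):
    """Return the overlords that name themselves as leader.

    `reported` maps each overlord's host:port to the leader it reports.
    """
    return sorted(host for host, leader in reported.items() if leader == host)


def has_conflict(reported):
    """True when more than one overlord claims leadership for itself."""
    return len(self_declared_leaders(reported)) > 1
```

To use it, build the mapping with something like {o: fetch_reported_leader(o) for o in overlords}, where overlords is your own list of host:port strings, and alert when has_conflict returns True. This only detects the condition; it does not prevent it.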