un-killable "zombie" tranq indexing tasks

Hi all,

We’re using tranquility 0.5.0 / druid 0.7.3.

We recently experienced some issues with our ZK cluster, and had to restart our druid cluster as well.

After bringing everything back up, we noticed something pretty strange. All in-flight tasks were killed (as expected), and all middleManagers reported no active tasks, except for ONE, which still reported 17 active tasks. The tasks were all from the same time range (02-19T06:00), around the time we restarted things. I hopped on that middleManager to verify whether the tasks were running, and discovered that none of the peons associated with those tasks were actually running anymore. Both the overlord and the middleManager know about the tasks, and both try to shut them down:

On overlord startup, we see this. It looks like the overlord knows these tasks are old, but it can’t shut them down.

Hey Jason,

Tranquility 0.5.0 and Druid 0.7.3 are quite old and there have been a few fixes since then related to task announcing and assignment. Could you try upgrading?

Hi Gian, thanks for the reply.

Upgrading to druid 0.8.3 is on our roadmap, and we’re currently testing it out in our staging environment. Unfortunately, it will still take some time for us to upgrade our prod druid cluster. Is there anything we can do, at least right now, to remove these tasks?

Thanks again,

-Jason

Does restarting the middleManager help?

It doesn’t appear to help. When I shut down that middleManager, the zombie tasks no longer appear in the overlord. When I bring that middleManager back up, the zombie tasks show up again. Also, right after the middleManager starts, we see the “Ignoring request to cancel unknown task” message repeated multiple times in the logs.

Thanks,

-Jason

The version is so old that I think it’ll be difficult for us to debug anything without setting aside dedicated time. You can try wiping the task entry from the metadata store and see if that helps.
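
For reference, here is a rough sketch of what wiping a task entry could look like. It is only an illustration under several assumptions: that the metadata tables use the default names druid_tasks, druid_tasklocks and druid_tasklogs, that the task id lives in the id / task_id columns, and that you run it against a backup or with the overlord stopped. The helper name wipe_task_entries and the task ids are made up, and the demo uses an in-memory SQLite database only so it can run anywhere; a real metadata store would be MySQL or PostgreSQL with a different driver and placeholder (usually "%s"). Nothing in this thread confirms that doing this clears the zombie tasks.

```python
import sqlite3

# Assumed (table, task-id column) pairs for Druid's default "druid_" prefix.
# Check these against your own metadata store before running anything.
TASK_TABLES = (
    ("druid_tasklocks", "task_id"),
    ("druid_tasklogs", "task_id"),
    ("druid_tasks", "id"),
)


def wipe_task_entries(conn, task_ids, placeholder="?"):
    """Delete all rows for the given task ids; return rows deleted per table.

    `conn` is any DB-API connection. `placeholder` is the driver's parameter
    marker ("?" for sqlite3, "%s" for most MySQL/PostgreSQL drivers).
    """
    deleted = {}
    cur = conn.cursor()
    for table, column in TASK_TABLES:
        count = 0
        for task_id in task_ids:
            cur.execute(
                f"DELETE FROM {table} WHERE {column} = {placeholder}",
                (task_id,),
            )
            count += cur.rowcount
        deleted[table] = count
    conn.commit()
    return deleted


def demo():
    """Build a throwaway schema, wipe one hypothetical task, report the result."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE druid_tasks (id TEXT PRIMARY KEY, active INTEGER);
        CREATE TABLE druid_tasklocks (id INTEGER PRIMARY KEY, task_id TEXT);
        CREATE TABLE druid_tasklogs (id INTEGER PRIMARY KEY, task_id TEXT);
        INSERT INTO druid_tasks VALUES ('zombie_task_1', 1), ('healthy_task_1', 1);
        INSERT INTO druid_tasklocks (task_id) VALUES ('zombie_task_1');
        """
    )
    deleted = wipe_task_entries(conn, ["zombie_task_1"])
    remaining = [row[0] for row in conn.execute("SELECT id FROM druid_tasks")]
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

Locks and logs are deleted before the task row itself so that nothing is left pointing at a task that no longer exists. Other tasks are untouched because every delete is keyed on the exact task id.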

If you need dedicated help, you can try http://imply.io/