Druid 0.9.0 - Large number of waiting/FAILED tasks while merging segments


I was testing the coordinator's automatic segment merge option in our environment and watching the overlord console. I found that:

  • there were a lot of waiting tasks: 2000+ tasks waiting on locks
  • about 50% of the tasks in the list of complete tasks had FAILED status. When I open the log for one of these tasks, it says “No log was found for this task. The task may not exist, or it may not have begun running yet.”
    What does this signify about my setup?
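
For anyone who wants to reproduce these counts without the console, here is a rough sketch that tallies task statuses from the overlord's HTTP API. The host and port in OVERLORD are assumptions for a default local setup, and the "statusCode" field name is my assumption about the shape of each entry returned by /druid/indexer/v1/completeTasks, so check the JSON your overlord actually returns before relying on it.

```python
import json
from collections import Counter
from urllib.request import urlopen

# Assumption: overlord address; replace with your own host and port.
OVERLORD = "http://localhost:8090"


def fetch_tasks(kind):
    """Fetch a task list from the overlord.

    kind is the last path segment, e.g. "waitingTasks" or "completeTasks".
    """
    with urlopen(f"{OVERLORD}/druid/indexer/v1/{kind}") as resp:
        return json.load(resp)


def tally_statuses(tasks):
    """Count tasks per status.

    Assumption: each entry has a "statusCode" field; entries without one
    are counted under "UNKNOWN".
    """
    return Counter(t.get("statusCode", "UNKNOWN") for t in tasks)


def failed_fraction(tasks):
    """Return the share of tasks whose status is FAILED (0.0 for an empty list)."""
    counts = tally_statuses(tasks)
    total = sum(counts.values())
    return counts["FAILED"] / total if total else 0.0
```

Calling failed_fraction(fetch_tasks("completeTasks")) gives the failed share, and len(fetch_tasks("waitingTasks")) gives the number of waiting tasks.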

My guess about the waiting tasks was that the tasks are being submitted too fast, so duplicates get submitted and end up waiting. The same might explain the FAILED tasks: maybe the duplicate tasks were dropped because they were duplicates.

If the above reasoning is incorrect, then what could be the reason?

I found the reason for the failures. I had set the peon JVM heap size quite low, to 256m, and the merge tasks that were failing involved larger segments. The duplicate issue may also play a part, but I am not sure about that. Anyone?
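
For reference, the peon heap is controlled through druid.indexer.runner.javaOpts in the middle manager's runtime.properties. A minimal sketch of the change follows. The 2g value is only an illustrative assumption, not a recommendation, and the right size depends on how large the segments being merged are.

```properties
# middleManager runtime.properties
# Assumption: 2g is an example value; the failing setup used -Xmx256m.
druid.indexer.runner.javaOpts=-server -Xmx2g
```

The middle manager has to be restarted for new peons to pick up the changed options.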