[druid-user] Tasks are failing with runnerstatuscode WAITING and error "No task in the corresponding pending completion taskGroup"

Hi All,

A few indexing and compaction tasks are failing after a waiting period with the error message "No task in the corresponding pending completion taskGroup". It doesn't seem to affect data availability, but I can see random unloaded segments with a very small size and unknown row information. Has anyone faced this issue? Please let me know.

Cluster - 5 data nodes + 1 Master + 1 query node
Druid version 34.0.2

Troubleshooting done so far:
1. Increased the task completion timeout
2. Tuned the capacity as per recommendations
3. Rebooted all cluster nodes

Error message

{
  "id": "index_kafka_XXXXXXXXXXXX",
  "groupId": "index_kafka_XXXXXXXXXXXXXXXXXXXX",
  "type": "index_kafka",
  "createdTime": "2023-01-25T03:04:07.563Z",
  "queueInsertionTime": "1970-01-01T00:00:00.000Z",
  "statusCode": "FAILED",
  "status": "FAILED",
  "runnerStatusCode": "WAITING",
  "duration": -1,
  "location": {
    "host": "drulx1005",
    "port": 8100,
    "tlsPort": -1
  },
  "dataSource": "XXXXXXXXXXXXXXXXXXX",
  "errorMsg": "No task in the corresponding pending completion taskGroup[0] succeeded before completion timeout ela…"
}

Regards
Rajesh

Hi Rajesh,
Can you share the failed task’s log?
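In case it helps, one way to pull the log is through the Overlord's task log endpoint (/druid/indexer/v1/task/{taskId}/log). Below is a minimal Python sketch. It assumes the Overlord is reachable over plain HTTP without authentication, on port 8090 (the usual default, so adjust it for your Master node). The function names and the host are only illustrative.

```python
import urllib.request
from urllib.parse import quote


def task_log_url(overlord_base: str, task_id: str) -> str:
    """Build the Overlord URL that serves the log of one task."""
    base = overlord_base.rstrip("/")
    # Percent-encode the task id so unusual characters cannot break the path.
    return f"{base}/druid/indexer/v1/task/{quote(task_id, safe='')}/log"


def fetch_task_log(overlord_base: str, task_id: str, timeout: float = 30.0) -> str:
    """Download the task log as text (raises urllib.error.HTTPError on 4xx/5xx)."""
    with urllib.request.urlopen(task_log_url(overlord_base, task_id), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Example (assumed host; use the full id of the failed task):
#   log_text = fetch_task_log("http://your-master-host:8090", "index_kafka_XXXXXXXXXXXX")
#   print(log_text[-5000:])  # the tail usually shows why the task stopped
```

If the endpoint returns 404, the log may not have been pushed to deep storage or may already have been cleaned up. In that case the Overlord log around the failure time is the next place to look.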
-Sergio