Kafka Supervisor unhealthy

The Kafka supervisor goes into an unhealthy state.

When I check the status of the supervisor, it shows the following error:
"exceptionClass":"java.util.concurrent.TimeoutException","message":"Timeout waiting for task."

This error comes up even when the tasks are in the running state.

Any help on this is highly appreciated.

curl -X GET http://localhost:8080/druid/indexer/v1/supervisor/sohan-ptest101/status

{"id":"sohan-ptest101","generationTime":"2021-09-21T00:45:29.107Z","payload":{"dataSource":"sohan-ptest101","stream":"sohan-ptest101","partitions":1,"replicas":1,"durationSeconds":3600,"activeTasks":[{"id":"index_kafka_sohan-ptest101_bf4a59daf3fd8c9_agjfeomp","startingOffsets":{"0":2},"startTime":"2021-09-21T00:42:14.999Z","remainingSeconds":3405,"type":"ACTIVE","currentOffsets":{"0":2},"lag":{"0":0}}],"publishingTasks":[],"latestOffsets":{"0":2},"minimumLag":{"0":0},"aggregateLag":0,"offsetsLastUpdated":"2021-09-21T00:45:10.740Z","suspended":false,"healthy":false,"state":"UNHEALTHY_TASKS","detailedState":"UNHEALTHY_TASKS","recentErrors":[{"timestamp":"2021-09-21T00:39:39.970Z","exceptionClass":"java.util.concurrent.TimeoutException","message":"Timeout waiting for task.","streamException":false}]}}

Relates to Apache Druid 0.21.0

Can you get the Overlord log for this period?

Thanks, Tijo, for the response. Below are my configs and the logs that I got from the Overlord. The Coordinator and Overlord are running on the same server. There are currently a lot of logs in the Overlord because I have multiple Kafka ingestions running.

Also, this setup is running on a GCP Kubernetes cluster, and each pod runs one of the Druid services.

**Another issue:** The Middle Manager keeps resetting the peons. For example, my current worker capacity for the Middle Manager is 6, and I have 6 Kafka supervisors running with their respective tasks also running, so the Middle Manager creates 6 peons. When I push data, I see it in the data source, but after some time the Middle Manager resets the peons to 0, 1, or 2 and I can no longer see any data sources. Even if I then push data to Kafka, I cannot see that data source and the data isn't inserted, even though the tasks are in running status.

Configs:

Broker:

  1. Number of Brokers: 1
  2. druid.broker.http.numConnections=20
  3. druid.server.http.numThreads=5
  4. druid.processing.buffer.sizeBytes=268435456
  5. druid.processing.numMergeBuffers=2
  6. druid.processing.numThreads=1
  7. druid.sql.enable=true
    DRUID_XMS=512m
    DRUID_XMX=2048m

Coordinator:

  1. Number of coordinators: 1
  2. druid.service=druid/coordinator
  3. druid.coordinator.startDelay=PT10S
  4. druid.coordinator.period=PT5S
  5. druid.coordinator.asOverlord.enabled=true
  6. druid.coordinator.asOverlord.overlordService=druid/overlord
  7. druid.indexer.queue.startDelay=PT30S
  8. druid.indexer.runner.type=remote
  9. druid.indexer.storage.type=metadata
  10. druid.indexer.runner.pendingTasksRunnerNumThreads=8
  11. druid.coordinator.maxNumConcurrentSubTasks=5
    DRUID_XMS=1g
    DRUID_XMX=2048m

Historicals

  1. Number of Historicals: 1
  2. druid.server.http.numThreads=10
  3. druid.processing.buffer.sizeBytes=536870912
  4. druid.processing.numMergeBuffers=2
  5. druid.processing.numThreads=2
  6. druid.segmentCache.locations=[{"path":"/druid/data/segments","maxSize":40000000000}]
  7. druid.server.maxSize=40000000000
    DRUID_XMS=1500m
    DRUID_XMX=1500m

Middle Managers

  1. Number of Middle managers: 1
  2. druid.worker.capacity=8
  3. druid.indexer.runner.javaOpts=-server -Xmx3g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
  4. druid.indexer.task.restoreTasksOnRestart=true
  5. druid.indexer.task.baseTaskDir=var/druid/task
  6. druid.server.http.numThreads=8
  7. druid.processing.numThreads=1
  8. druid.indexer.fork.property.druid.processing.numMergeBuffers=2
  9. druid.indexer.fork.property.druid.processing.buffer.sizeBytes=104857600
  10. druid.indexer.fork.property.druid.processing.numThreads=2
    DRUID_XMS=4096m
    DRUID_XMX=4096m
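
As a rough cross-check of the Middle Manager settings above, the following Python sketch estimates the worst-case memory the Middle Manager pod could need if every task slot were in use. It assumes the direct-memory rule of thumb from the Druid tuning documentation, `buffer.sizeBytes * (numThreads + numMergeBuffers + 1)`, and it ignores metaspace and other JVM overhead, so treat the result as an estimate only. The pod's actual memory limit is not given in this thread; the function names are hypothetical.

```python
MIB = 1024 * 1024


def peon_direct_bytes(buffer_size_bytes: int, num_threads: int, num_merge_buffers: int) -> int:
    """Direct memory one peon needs for its processing and merge buffers."""
    return buffer_size_bytes * (num_threads + num_merge_buffers + 1)


def worst_case_mib(capacity: int, peon_xmx_mib: int, buffer_size_bytes: int,
                   num_threads: int, num_merge_buffers: int, mm_heap_mib: int) -> int:
    """Middle Manager heap plus heap and direct memory for `capacity` peons."""
    direct_mib = peon_direct_bytes(buffer_size_bytes, num_threads, num_merge_buffers) // MIB
    return mm_heap_mib + capacity * (peon_xmx_mib + direct_mib)


# Values copied from the Middle Manager config above.
total_mib = worst_case_mib(
    capacity=8,                   # druid.worker.capacity
    peon_xmx_mib=3 * 1024,        # -Xmx3g in druid.indexer.runner.javaOpts
    buffer_size_bytes=104857600,  # fork property processing.buffer.sizeBytes
    num_threads=2,                # fork property processing.numThreads
    num_merge_buffers=2,          # fork property processing.numMergeBuffers
    mm_heap_mib=4096,             # DRUID_XMX
)

if __name__ == "__main__":
    print(f"worst case: {total_mib} MiB (~{total_mib / 1024:.1f} GiB)")
```

Comparing this figure with the memory limit of the Middle Manager pod may help rule out (or confirm) that peons are being killed for memory reasons.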

Routers

  1. Number of routers: 1
  2. druid.service=druid/router
  3. druid.processing.numThreads=1
  4. druid.router.http.numConnections=50
  5. druid.router.http.readTimeout=PT5M
  6. druid.router.http.numMaxThreads=100
  7. druid.server.http.numThreads=100
  8. druid.router.defaultBrokerServiceName=druid/broker
  9. druid.router.coordinatorServiceName=druid/coordinator
  10. druid.router.managementProxy.enabled=true
    DRUID_XMS=4096m
    DRUID_XMX=4096m

**Logs from the Coordinator for the unhealthy task:**


[INFO ] 2021-09-21 12:01:51.226 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - {id='sohan-ptest101', generationTime=2021-09-21T12:01:51.226Z, payload=KafkaSupervisorReportPayload{dataSource='sohan-ptest101', topic='sohan-ptest101', partitions=1, replicas=1, durationSeconds=1800, active=[{id='index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc', startTime=2021-09-21T11:45:50.332Z, remainingSeconds=839}], publishing=[], suspended=false, healthy=false, state=UNHEALTHY_TASKS, detailedState=UNHEALTHY_TASKS, recentErrors=[org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@8036d2e, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@59c46977, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@65126f61]}}

[INFO ] 2021-09-21 12:01:55.241 [TaskQueue-Manager] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest104_bc575040b6486d6_cbcjgkch] because: [task is not in knownTaskIds[[index_kafka_sohan-ptest107_fa319fe9fd94f3e_hahliicj, index_kafka_sohan-ptest105_ff1dd6550d76919_mnabdbbl, index_kafka_sohan-ptest103_5f787d4bc31df34_abigahae, index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc, index_kafka_sohan-ptest104_bc575040b6486d6_jhmfhpji, index_kafka_sohan-ptest102_38ad2e18b50f64f_bknnhmfa, index_kafka_sohan-ptest106_2ac3c21d86c9496_llcmgfdl]]]

[INFO ] 2021-09-21 12:01:55.241 [TaskQueue-Manager] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest102_38ad2e18b50f64f_acoeoglo] because: [task is not in knownTaskIds[[index_kafka_sohan-ptest107_fa319fe9fd94f3e_hahliicj, index_kafka_sohan-ptest105_ff1dd6550d76919_mnabdbbl, index_kafka_sohan-ptest103_5f787d4bc31df34_abigahae, index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc, index_kafka_sohan-ptest104_bc575040b6486d6_jhmfhpji, index_kafka_sohan-ptest102_38ad2e18b50f64f_bknnhmfa, index_kafka_sohan-ptest106_2ac3c21d86c9496_llcmgfdl]]]

[INFO ] 2021-09-21 12:01:55.241 [TaskQueue-Manager] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest105_ff1dd6550d76919_fdbdhhgm] because: [task is not in knownTaskIds[[index_kafka_sohan-ptest107_fa319fe9fd94f3e_hahliicj, index_kafka_sohan-ptest105_ff1dd6550d76919_mnabdbbl, index_kafka_sohan-ptest103_5f787d4bc31df34_abigahae, index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc, index_kafka_sohan-ptest104_bc575040b6486d6_jhmfhpji, index_kafka_sohan-ptest102_38ad2e18b50f64f_bknnhmfa, index_kafka_sohan-ptest106_2ac3c21d86c9496_llcmgfdl]]]

[INFO ] 2021-09-21 12:01:55.241 [TaskQueue-Manager] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest103_5f787d4bc31df34_kcdjokgi] because: [task is not in knownTaskIds[[index_kafka_sohan-ptest107_fa319fe9fd94f3e_hahliicj, index_kafka_sohan-ptest105_ff1dd6550d76919_mnabdbbl, index_kafka_sohan-ptest103_5f787d4bc31df34_abigahae, index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc, index_kafka_sohan-ptest104_bc575040b6486d6_jhmfhpji, index_kafka_sohan-ptest102_38ad2e18b50f64f_bknnhmfa, index_kafka_sohan-ptest106_2ac3c21d86c9496_llcmgfdl]]]

[INFO ] 2021-09-21 12:01:55.241 [TaskQueue-Manager] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest106_2ac3c21d86c9496_jedolhdk] because: [task is not in knownTaskIds[[index_kafka_sohan-ptest107_fa319fe9fd94f3e_hahliicj, index_kafka_sohan-ptest105_ff1dd6550d76919_mnabdbbl, index_kafka_sohan-ptest103_5f787d4bc31df34_abigahae, index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc, index_kafka_sohan-ptest104_bc575040b6486d6_jhmfhpji, index_kafka_sohan-ptest102_38ad2e18b50f64f_bknnhmfa, index_kafka_sohan-ptest106_2ac3c21d86c9496_llcmgfdl]]]

[INFO ] 2021-09-21 12:02:08.018 [IndexTaskClient-sohan-ptest101-0] IndexTaskClient - submitRequest failed for [http://10.60.6.122:8105/druid/worker/v1/chat/index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc/offsets/current], with message [No route to host (Host unreachable)]

[INFO ] 2021-09-21 12:02:08.018 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to LATEST offset of partition sohan-ptest101-0

[INFO ] 2021-09-21 12:02:08.021 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.

[INFO ] 2021-09-21 12:02:19.924 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - [sohan-ptest101] supervisor is running.

[INFO ] 2021-09-21 12:02:19.924 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - {id='sohan-ptest101', generationTime=2021-09-21T12:02:19.924Z, payload=KafkaSupervisorReportPayload{dataSource='sohan-ptest101', topic='sohan-ptest101', partitions=1, replicas=1, durationSeconds=1800, active=[{id='index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc', startTime=2021-09-21T11:45:50.332Z, remainingSeconds=810}], publishing=[], suspended=false, healthy=false, state=UNHEALTHY_TASKS, detailedState=UNHEALTHY_TASKS, recentErrors=[org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@8036d2e, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@59c46977, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@65126f61]}}

[INFO ] 2021-09-21 12:02:21.407 [RemoteTaskRunner-Scheduled-Cleanup--0] RemoteTaskRunner - Failing task[index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc]

[INFO ] 2021-09-21 12:02:21.407 [RemoteTaskRunner-Scheduled-Cleanup--0] TaskQueue - Received FAILED status for task: index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc

[INFO ] 2021-09-21 12:02:21.408 [RemoteTaskRunner-Scheduled-Cleanup--0] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc] because: [notified status change from task]

[INFO ] 2021-09-21 12:02:21.408 [RemoteTaskRunner-Scheduled-Cleanup--0] RemoteTaskRunner - Can't shutdown! No worker running task index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc

[INFO ] 2021-09-21 12:02:21.408 [RemoteTaskRunner-Scheduled-Cleanup--0] TaskLockbox - Removing task[index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc] from activeTasks

[INFO ] 2021-09-21 12:02:21.411 [RemoteTaskRunner-Scheduled-Cleanup--0] MetadataTaskStorage - Updating task index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc to status: TaskStatus{id=index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc, status=FAILED, duration=-1, errorMsg=null}

[INFO ] 2021-09-21 12:02:21.419 [RemoteTaskRunner-Scheduled-Cleanup--0] TaskQueue - Task done: AbstractTask{id='index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc', groupId='index_kafka_sohan-ptest101', taskResource=TaskResource{availabilityGroup='index_kafka_sohan-ptest101_36c94e779daa642', requiredCapacity=1}, dataSource='sohan-ptest101', context={forceTimeChunkLock=true, checkpoints={"0":{"0":8}}, IS_INCREMENTAL_HANDOFF_SUPPORTED=true}}

[INFO ] 2021-09-21 12:02:21.422 [RemoteTaskRunner-Scheduled-Cleanup--0] TaskQueue - Task FAILED: AbstractTask{id='index_kafka_sohan-ptest101_36c94e779daa642_lmbalaoc', groupId='index_kafka_sohan-ptest101', taskResource=TaskResource{availabilityGroup='index_kafka_sohan-ptest101_36c94e779daa642', requiredCapacity=1}, dataSource='sohan-ptest101', context={forceTimeChunkLock=true, checkpoints={"0":{"0":8}}, IS_INCREMENTAL_HANDOFF_SUPPORTED=true}} (-1 run duration)

[INFO ] 2021-09-21 12:02:21.435 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - [sohan-ptest101] supervisor is running.

[WARN ] 2021-09-21 12:02:21.437 [KafkaSupervisor-sohan-ptest101-Worker-0] SeekableStreamSupervisor - Clearing task group [0] information as no valid tasks left the group

[INFO ] 2021-09-21 12:02:21.437 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - Creating new task group [0] for partitions [0]

[INFO ] 2021-09-21 12:02:21.439 [KafkaSupervisor-sohan-ptest101] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to LATEST offset of partition sohan-ptest101-0

[INFO ] 2021-09-21 12:02:21.440 [KafkaSupervisor-sohan-ptest101] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.

[INFO ] 2021-09-21 12:02:21.441 [KafkaSupervisor-sohan-ptest101] KafkaConsumer - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to offset 9 for partition sohan-ptest101-0

[INFO ] 2021-09-21 12:02:21.442 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - Number of tasks [0] does not match configured numReplicas [1] in task group [0], creating more tasks

[INFO ] 2021-09-21 12:02:21.444 [KafkaSupervisor-sohan-ptest101] MetadataTaskStorage - Inserting task index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn with status: TaskStatus{id=index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn, status=RUNNING, duration=-1, errorMsg=null}

[INFO ] 2021-09-21 12:02:21.453 [KafkaSupervisor-sohan-ptest101] TaskLockbox - Adding task[index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn] to activeTasks

[INFO ] 2021-09-21 12:02:21.454 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - {id='sohan-ptest101', generationTime=2021-09-21T12:02:21.454Z, payload=KafkaSupervisorReportPayload{dataSource='sohan-ptest101', topic='sohan-ptest101', partitions=1, replicas=1, durationSeconds=1800, active=[], publishing=[], suspended=false, healthy=false, state=UNHEALTHY_TASKS, detailedState=UNHEALTHY_TASKS, recentErrors=[org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@8036d2e, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@59c46977, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@65126f61]}}

[INFO ] 2021-09-21 12:02:21.455 [TaskQueue-Manager] TaskQueue - Asking taskRunner to run: index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn

[INFO ] 2021-09-21 12:02:21.455 [TaskQueue-Manager] RemoteTaskRunner - Added pending task index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn

[INFO ] 2021-09-21 12:02:21.456 [rtr-pending-tasks-runner-2] RemoteTaskRunner - Coordinator asking Worker[10.60.11.127:8088] to add task[index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn]

[INFO ] 2021-09-21 12:02:21.457 [TaskQueue-Manager] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest106_2ac3c21d86c9496_jedolhdk] because: [task is not in knownTaskIds[[index_kafka_sohan-ptest107_fa319fe9fd94f3e_hahliicj, index_kafka_sohan-ptest105_ff1dd6550d76919_mnabdbbl, index_kafka_sohan-ptest103_5f787d4bc31df34_abigahae, index_kafka_sohan-ptest104_bc575040b6486d6_jhmfhpji, index_kafka_sohan-ptest102_38ad2e18b50f64f_bknnhmfa, index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn, index_kafka_sohan-ptest106_2ac3c21d86c9496_llcmgfdl]]]

[INFO ] 2021-09-21 12:02:21.462 [rtr-pending-tasks-runner-2] RemoteTaskRunner - Task index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn switched from pending to running (on [10.60.11.127:8088])

[INFO ] 2021-09-21 12:02:21.491 [Curator-PathChildrenCache-1] RemoteTaskRunner - Worker[10.60.11.127:8088] wrote RUNNING status for task [index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn] on [TaskLocation{host='null', port=-1, tlsPort=-1}]

[INFO ] 2021-09-21 12:02:21.516 [Curator-PathChildrenCache-1] RemoteTaskRunner - Worker[10.60.11.127:8088] wrote RUNNING status for task [index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn] on [TaskLocation{host='10.60.11.127', port=8105, tlsPort=-1}]

[INFO ] 2021-09-21 12:02:34.914 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to LATEST offset of partition sohan-ptest101-0

[INFO ] 2021-09-21 12:02:34.917 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.

[INFO ] 2021-09-21 12:02:53.395 [Curator-PathChildrenCache-0] RemoteTaskRunner - [10.60.11.127:8088]: Found [index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn] running

[INFO ] 2021-09-21 12:03:04.915 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to LATEST offset of partition sohan-ptest101-0

[INFO ] 2021-09-21 12:03:04.917 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.

[INFO ] 2021-09-21 12:03:34.914 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to LATEST offset of partition sohan-ptest101-0

[INFO ] 2021-09-21 12:03:34.916 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.

[INFO ] 2021-09-21 12:04:04.914 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to LATEST offset of partition sohan-ptest101-0

[INFO ] 2021-09-21 12:04:04.916 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.

[INFO ] 2021-09-21 12:04:34.914 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to LATEST offset of partition sohan-ptest101-0

[INFO ] 2021-09-21 12:04:34.916 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.

[WARN ] 2021-09-21 12:04:54.299 [IndexTaskClient-sohan-ptest101-0] IndexTaskClient - Retries exhausted for [http://10.60.11.127:8105/druid/worker/v1/chat/index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn/status], last exception:

[INFO ] 2021-09-21 12:04:54.300 [KafkaSupervisor-sohan-ptest101] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn] because: [Task [index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn] failed to return status, killing task]

[INFO ] 2021-09-21 12:04:54.300 [KafkaSupervisor-sohan-ptest101] RemoteTaskRunner - Can't shutdown! No worker running task index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn

[INFO ] 2021-09-21 12:04:54.300 [KafkaSupervisor-sohan-ptest101] TaskLockbox - Removing task[index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn] from activeTasks

[INFO ] 2021-09-21 12:04:54.303 [KafkaSupervisor-sohan-ptest101] MetadataTaskStorage - Updating task index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn to status: TaskStatus{id=index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn, status=FAILED, duration=-1, errorMsg=null}

[INFO ] 2021-09-21 12:04:54.312 [KafkaSupervisor-sohan-ptest101] TaskQueue - Task done: AbstractTask{id='index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn', groupId='index_kafka_sohan-ptest101', taskResource=TaskResource{availabilityGroup='index_kafka_sohan-ptest101_587a5e8e634cb25', requiredCapacity=1}, dataSource='sohan-ptest101', context={forceTimeChunkLock=true, checkpoints={"0":{"0":9}}, IS_INCREMENTAL_HANDOFF_SUPPORTED=true}}

[INFO ] 2021-09-21 12:04:54.312 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - [sohan-ptest101] supervisor is running.

[INFO ] 2021-09-21 12:04:54.313 [TaskQueue-Manager] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn] because: [task is not in knownTaskIds[[index_kafka_sohan-ptest105_ff1dd6550d76919_mnabdbbl, index_kafka_sohan-ptest103_5f787d4bc31df34_abigahae, index_kafka_sohan-ptest104_bc575040b6486d6_jhmfhpji, index_kafka_sohan-ptest102_38ad2e18b50f64f_bknnhmfa, index_kafka_sohan-ptest107_fa319fe9fd94f3e_fbjbbbjm, index_kafka_sohan-ptest106_2ac3c21d86c9496_llcmgfdl]]]

[INFO ] 2021-09-21 12:04:54.313 [TaskQueue-Manager] RemoteTaskRunner - Can't shutdown! No worker running task index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn

[WARN ] 2021-09-21 12:04:54.314 [KafkaSupervisor-sohan-ptest101-Worker-0] SeekableStreamSupervisor - Clearing task group [0] information as no valid tasks left the group

[INFO ] 2021-09-21 12:04:54.314 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - Creating new task group [0] for partitions [0]

[INFO ] 2021-09-21 12:04:54.316 [KafkaSupervisor-sohan-ptest101] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to LATEST offset of partition sohan-ptest101-0

[INFO ] 2021-09-21 12:04:54.317 [KafkaSupervisor-sohan-ptest101] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.

[INFO ] 2021-09-21 12:04:54.317 [KafkaSupervisor-sohan-ptest101] KafkaConsumer - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to offset 9 for partition sohan-ptest101-0

[INFO ] 2021-09-21 12:04:54.317 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - Number of tasks [0] does not match configured numReplicas [1] in task group [0], creating more tasks

[INFO ] 2021-09-21 12:04:54.319 [KafkaSupervisor-sohan-ptest101] MetadataTaskStorage - Inserting task index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih with status: TaskStatus{id=index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih, status=RUNNING, duration=-1, errorMsg=null}

[INFO ] 2021-09-21 12:04:54.328 [KafkaSupervisor-sohan-ptest101] TaskLockbox - Adding task[index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih] to activeTasks

[INFO ] 2021-09-21 12:04:54.328 [TaskQueue-Manager] TaskQueue - Asking taskRunner to run: index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih

[INFO ] 2021-09-21 12:04:54.328 [TaskQueue-Manager] RemoteTaskRunner - Added pending task index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih

[INFO ] 2021-09-21 12:04:54.328 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - {id='sohan-ptest101', generationTime=2021-09-21T12:04:54.328Z, payload=KafkaSupervisorReportPayload{dataSource='sohan-ptest101', topic='sohan-ptest101', partitions=1, replicas=1, durationSeconds=1800, active=[], publishing=[], suspended=false, healthy=false, state=UNHEALTHY_TASKS, detailedState=UNHEALTHY_TASKS, recentErrors=[org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@8036d2e, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@59c46977, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@65126f61]}}

[INFO ] 2021-09-21 12:04:54.328 [TaskQueue-Manager] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn] because: [task is not in knownTaskIds[[index_kafka_sohan-ptest105_ff1dd6550d76919_mnabdbbl, index_kafka_sohan-ptest103_5f787d4bc31df34_abigahae, index_kafka_sohan-ptest104_bc575040b6486d6_jhmfhpji, index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih, index_kafka_sohan-ptest102_38ad2e18b50f64f_bknnhmfa, index_kafka_sohan-ptest107_fa319fe9fd94f3e_fbjbbbjm, index_kafka_sohan-ptest106_2ac3c21d86c9496_llcmgfdl]]]

[INFO ] 2021-09-21 12:04:54.328 [TaskQueue-Manager] RemoteTaskRunner - Can't shutdown! No worker running task index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn

[INFO ] 2021-09-21 12:04:54.329 [rtr-pending-tasks-runner-5] RemoteTaskRunner - Coordinator asking Worker[10.60.79.231:8088] to add task[index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih]

[WARN ] 2021-09-21 12:04:54.343 [KafkaSupervisor-sohan-ptest101-Worker-0] SeekableStreamSupervisor - Ignoring task [index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih], as probably it is not started running yet

[INFO ] 2021-09-21 12:04:54.350 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - [sohan-ptest101] supervisor is running.

[INFO ] 2021-09-21 12:04:54.351 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - {id='sohan-ptest101', generationTime=2021-09-21T12:04:54.351Z, payload=KafkaSupervisorReportPayload{dataSource='sohan-ptest101', topic='sohan-ptest101', partitions=1, replicas=1, durationSeconds=1800, active=[{id='index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih', startTime=null, remainingSeconds=null}], publishing=[], suspended=false, healthy=false, state=UNHEALTHY_TASKS, detailedState=UNHEALTHY_TASKS, recentErrors=[org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@8036d2e, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@59c46977, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@65126f61]}}

[INFO ] 2021-09-21 12:04:54.372 [rtr-pending-tasks-runner-5] RemoteTaskRunner - Task index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih switched from pending to running (on [10.60.79.231:8088])

[INFO ] 2021-09-21 12:04:55.231 [Curator-PathChildrenCache-1] RemoteTaskRunner - Worker[10.60.79.231:8088] wrote RUNNING status for task [index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih] on [TaskLocation{host='null', port=-1, tlsPort=-1}]

[INFO ] 2021-09-21 12:04:55.237 [TaskQueue-Manager] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn] because: [task is not in knownTaskIds[[index_kafka_sohan-ptest105_ff1dd6550d76919_mnabdbbl, index_kafka_sohan-ptest103_5f787d4bc31df34_abigahae, index_kafka_sohan-ptest104_bc575040b6486d6_jhmfhpji, index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih, index_kafka_sohan-ptest102_38ad2e18b50f64f_bknnhmfa, index_kafka_sohan-ptest107_fa319fe9fd94f3e_fbjbbbjm, index_kafka_sohan-ptest106_2ac3c21d86c9496_llcmgfdl]]]

[INFO ] 2021-09-21 12:04:55.238 [TaskQueue-Manager] RemoteTaskRunner - Can't shutdown! No worker running task index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn

[INFO ] 2021-09-21 12:04:55.291 [Curator-PathChildrenCache-1] RemoteTaskRunner - Worker[10.60.79.231:8088] wrote RUNNING status for task [index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih] on [TaskLocation{host='10.60.79.231', port=8100, tlsPort=-1}]

[INFO ] 2021-09-21 12:05:13.625 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - [sohan-ptest101] supervisor is running.

[INFO ] 2021-09-21 12:05:13.625 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - {id='sohan-ptest101', generationTime=2021-09-21T12:05:13.625Z, payload=KafkaSupervisorReportPayload{dataSource='sohan-ptest101', topic='sohan-ptest101', partitions=1, replicas=1, durationSeconds=1800, active=[{id='index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih', startTime=2021-09-21T12:05:04.520Z, remainingSeconds=1790}], publishing=[], suspended=false, healthy=false, state=UNHEALTHY_TASKS, detailedState=UNHEALTHY_TASKS, recentErrors=[org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@8036d2e, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@59c46977, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@65126f61]}}

[INFO ] 2021-09-21 12:05:13.641 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to LATEST offset of partition sohan-ptest101-0

[INFO ] 2021-09-21 12:05:13.643 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.

[INFO ] 2021-09-21 12:05:19.923 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - [sohan-ptest101] supervisor is running.

[INFO ] 2021-09-21 12:05:19.924 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - {id='sohan-ptest101', generationTime=2021-09-21T12:05:19.924Z, payload=KafkaSupervisorReportPayload{dataSource='sohan-ptest101', topic='sohan-ptest101', partitions=1, replicas=1, durationSeconds=1800, active=[{id='index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih', startTime=2021-09-21T12:05:04.520Z, remainingSeconds=1784}], publishing=[], suspended=false, healthy=false, state=UNHEALTHY_TASKS, detailedState=UNHEALTHY_TASKS, recentErrors=[org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@8036d2e, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@59c46977, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@65126f61]}}

[INFO ] 2021-09-21 12:05:34.925 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.

[INFO ] 2021-09-21 12:05:49.923 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - [sohan-ptest101] supervisor is running.

[INFO ] 2021-09-21 12:05:49.923 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - {id='sohan-ptest101', generationTime=2021-09-21T12:05:49.923Z, payload=KafkaSupervisorReportPayload{dataSource='sohan-ptest101', topic='sohan-ptest101', partitions=1, replicas=1, durationSeconds=1800, active=[{id='index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih', startTime=2021-09-21T12:05:04.520Z, remainingSeconds=1754}], publishing=[], suspended=false, healthy=false, state=UNHEALTHY_TASKS, detailedState=UNHEALTHY_TASKS, recentErrors=[org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@8036d2e, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@59c46977, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@65126f61]}}

[INFO ] 2021-09-21 12:05:55.238 [TaskQueue-Manager] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn] because: [task is not in knownTaskIds[[index_kafka_sohan-ptest105_ff1dd6550d76919_mnabdbbl, index_kafka_sohan-ptest103_5f787d4bc31df34_abigahae, index_kafka_sohan-ptest104_bc575040b6486d6_jhmfhpji, index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih, index_kafka_sohan-ptest102_38ad2e18b50f64f_bknnhmfa, index_kafka_sohan-ptest107_fa319fe9fd94f3e_fbjbbbjm, index_kafka_sohan-ptest106_2ac3c21d86c9496_llcmgfdl]]]

[INFO ] 2021-09-21 12:05:55.238 [TaskQueue-Manager] RemoteTaskRunner - Can't shutdown! No worker running task index_kafka_sohan-ptest101_587a5e8e634cb25_ccjamidn

[INFO ] 2021-09-21 12:06:04.923 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to LATEST offset of partition sohan-ptest101-0

[INFO ] 2021-09-21 12:06:04.924 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.

[INFO ] 2021-09-21 12:06:19.922 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - [sohan-ptest101] supervisor is running.

[INFO ] 2021-09-21 12:06:19.923 [KafkaSupervisor-sohan-ptest101] SeekableStreamSupervisor - {id='sohan-ptest101', generationTime=2021-09-21T12:06:19.923Z, payload=KafkaSupervisorReportPayload{dataSource='sohan-ptest101', topic='sohan-ptest101', partitions=1, replicas=1, durationSeconds=1800, active=[{id='index_kafka_sohan-ptest101_587a5e8e634cb25_epobljih', startTime=2021-09-21T12:05:04.520Z, remainingSeconds=1724}], publishing=[], suspended=false, healthy=false, state=UNHEALTHY_TASKS, detailedState=UNHEALTHY_TASKS, recentErrors=[org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@8036d2e, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@59c46977, org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorStateManager$SeekableStreamExceptionEvent@65126f61]}}

[INFO ] 2021-09-21 12:06:34.925 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Seeking to LATEST offset of partition sohan-ptest101-0

[INFO ] 2021-09-21 12:06:34.927 [KafkaSupervisor-sohan-ptest101-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-jlgpmfda-17, groupId=kafka-supervisor-jlgpmfda] Resetting offset for partition sohan-ptest101-0 to position FetchPosition{offset=9, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.```

I also see something like this in the log (I have masked the IP address):

[INFO ] 2021-09-21 02:56:37.395 [KafkaSupervisor-sohan-ptest107-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-idpiemmi-16, groupId=kafka-supervisor-idpiemmi] Seeking to LATEST offset of partition sohan-ptest107-0
[INFO ] 2021-09-21 02:56:37.396 [KafkaSupervisor-sohan-ptest107-Reporting-0] SubscriptionState - [Consumer clientId=consumer-kafka-supervisor-idpiemmi-16, groupId=kafka-supervisor-idpiemmi] Resetting offset for partition sohan-ptest107-0 to position FetchPosition{offset=2, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[kafka-0.kafka-headless.kafka.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}.
[WARN ] 2021-09-21 02:56:40.467 [IndexTaskClient-sohan-ptest105-0] IndexTaskClient - Retries exhausted for [http://10.X.X.X:8105/druid/worker/v1/chat/index_kafka_sohan-ptest105_f2a03a8bcef5641_polnmbbp/status], last exception:
java.net.NoRouteToHostException: No route to host (Host unreachable)
	at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_275]
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_275]
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_275]
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_275]
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_275]
	at java.net.Socket.connect(Socket.java:607) ~[?:1.8.0_275]
	at java.net.Socket.connect(Socket.java:556) ~[?:1.8.0_275]
	at java.net.Socket.<init>(Socket.java:452) ~[?:1.8.0_275]
	at java.net.Socket.<init>(Socket.java:229) ~[?:1.8.0_275]
	at org.apache.druid.indexing.common.IndexTaskClient.checkConnection(IndexTaskClient.java:209) ~[druid-indexing-service-0.21.1.jar:0.21.1]
	at org.apache.druid.indexing.common.IndexTaskClient.submitRequest(IndexTaskClient.java:348) ~[druid-indexing-service-0.21.1.jar:0.21.1]
	at org.apache.druid.indexing.common.IndexTaskClient.submitRequestWithEmptyContent(IndexTaskClient.java:220) ~[druid-indexing-service-0.21.1.jar:0.21.1]
	at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient.getStatus(SeekableStreamIndexTaskClient.java:172) ~[druid-indexing-service-0.21.1.jar:0.21.1]
	at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient.lambda$getStatusAsync$9(SeekableStreamIndexTaskClient.java:373) ~[druid-indexing-service-0.21.1.jar:0.21.1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_275]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_275]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_275]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
[INFO ] 2021-09-21 02:56:40.468 [KafkaSupervisor-sohan-ptest105] RemoteTaskRunner - Shutdown [index_kafka_sohan-ptest105_f2a03a8bcef5641_polnmbbp] because: [Task [index_kafka_sohan-ptest105_f2a03a8bcef5641_polnmbbp] failed to return status, killing task]
[INFO ] 2021-09-21 02:56:40.468 [KafkaSupervisor-sohan-ptest105] RemoteTaskRunner - Can't shutdown! No worker running task index_kafka_sohan-ptest105_f2a03a8bcef5641_polnmbbp
[INFO ] 2021-09-21 02:56:40.468 [KafkaSupervisor-sohan-ptest105] TaskLockbox - Removing task[index_kafka_sohan-ptest105_f2a03a8bcef5641_polnmbbp] from activeTasks
[INFO ] 2021-09-21 02:56:40.473 [KafkaSupervisor-sohan-ptest105] MetadataTaskStorage - Updating task index_kafka_sohan-ptest105_f2a03a8bcef5641_polnmbbp to status: TaskStatus{id=index_kafka_sohan-ptest105_f2a03a8bcef5641_polnmbbp, status=FAILED, duration=-1, errorMsg=null}```
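With several supervisors logging at once, it can help to pull out just the endpoints the Overlord gave up on. Below is a minimal sketch, assuming the `Retries exhausted for [...]` warning format shown in the log above. The function name `unreachable_endpoints` is hypothetical and not part of Druid.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Captures the URL inside "Retries exhausted for [<url>]" (format taken from the log above).
RETRIES_EXHAUSTED = re.compile(r"Retries exhausted for \[(https?://[^\]]+)\]")


def unreachable_endpoints(log_lines):
    """Count host:port pairs that appear in 'Retries exhausted' warnings."""
    counts = Counter()
    for line in log_lines:
        match = RETRIES_EXHAUSTED.search(line)
        if match:
            # netloc is the "host:port" part of the URL, kept exactly as logged.
            counts[urlsplit(match.group(1)).netloc] += 1
    return counts
```

Feeding the Overlord log file to it line by line (for example `unreachable_endpoints(open(path))`, where `path` is wherever your Overlord log lives) gives a count per task host and port, which shows whether the failures cluster on one MiddleManager pod.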

Hi Tijo,

Can you please let me know if you need any more info? I am still facing this issue…

Hi Sohansamant,
I think you have a network issue. It looks like the connection from the Druid cluster to the Kafka cluster is not stable.
Try running kafka-console-consumer from the Overlord and MiddleManager nodes after enabling logging. This can show whether there are any connectivity issues.
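
If the Kafka CLI tools are not available inside the pods, a plain TCP check is a rough first test. This is only a sketch: it checks that a socket can be opened, not that the Kafka protocol or the Druid task API actually works, and the helper name `can_connect` is made up for illustration. Note that the `NoRouteToHostException` in the log above is raised for port 8105 (the task's chat endpoint), so it may be worth testing that host and port as well as the broker.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, timeouts and "no route to host".
        return False
```

For example, calling `can_connect("kafka-0.kafka-headless.kafka.svc.cluster.local", 9092)` from the Overlord pod tests the broker address that appears in the log. Running the same check against the peon host and port from the `Retries exhausted` URL tests the Overlord to task path.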