KafkaSupervisor task failed with completion timeout elapsed

Hi All,

I am using:

  1. Druid version: druid-0.12.0

  2. Kafka version: kafka_2.11-1.1.0

  3. The druid-kafka-indexing-service extension for ingesting a Kafka stream into Druid.

2018-04-25 11:47:28,742 ERROR i.d.i.k.s.KafkaSupervisor [KafkaSupervisor-test3] No task in [[index_kafka_test3_970f510e2628df6_phpjaoje]] succeeded before the completion timeout elapsed [PT1800S]!: {class=io.druid.indexing.kafka.supervisor.KafkaSupervisor}

I have found that a few of the tasks completed successfully, though the majority failed.

I have increased “taskDuration” in my supervisor spec and increased “druid.worker.capacity” in middle manager properties.

Can you please let me know what other configuration tuning I can do for this?

Thanks in advance for your help.

Best Wishes,

Sohel

Same problem. Have you solved it?

How much data in bytes is being published by the task? Do the logs for the failed tasks show them trying to publish segments?

If, given the data size, it’s not unreasonable for the segment publish to take 30 minutes or more, then you can try increasing completionTimeout in the ioConfig of the Kafka supervisor spec from the default of PT1800S (30 minutes).
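
For reference, a minimal sketch of where that setting lives in the supervisor spec. The topic, replicas, and taskDuration values mirror the supervisor log output in this thread; the bootstrap.servers address and the PT60M value are placeholders chosen only for illustration, and dataSchema and tuningConfig are omitted, so this fragment is not a complete spec:

```json
{
  "type": "kafka",
  "ioConfig": {
    "topic": "test3",
    "consumerProperties": {
      "bootstrap.servers": "localhost:9092"
    },
    "taskCount": 1,
    "replicas": 1,
    "taskDuration": "PT10M",
    "completionTimeout": "PT60M"
  }
}
```

Both taskDuration and completionTimeout are ISO 8601 periods. The completion timeout only starts counting once a task stops reading and begins publishing, so raising taskDuration on its own would not be expected to make this error go away.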

If that time seems unreasonable for the amount of data you’re working with, it may be a good idea to find out why the publish is taking so long.

Thanks,

Jon

Hi Jon,

Thanks for your reply. My data is very small, only a few kilobytes. The supervisor logs, written every 30 seconds, show that the failed task was still in the publishing state when the timeout elapsed:

2018-05-09 13:18:28,742 INFO i.d.i.k.s.KafkaSupervisor [KafkaSupervisor-test3] {id=‘test3’, generationTime=2018-05-09T13:18:28.742Z, payload={dataSource=‘test3’, topic=‘test3’, partitions=1, replicas=1, durationSeconds=600, active=[{id=‘index_kafka_test3_04c8dd52c83a25b_hbicpkfi’, startTime=null, remainingSeconds=null}], publishing=[{id=‘index_kafka_test3_04c8dd52c83a25b_aefkalhc’, startTime=2018-05-09T12:39:30.055Z, remainingSeconds=62}]}}

2018-05-09 13:18:58,741 INFO i.d.i.k.s.KafkaSupervisor [KafkaSupervisor-test3] {id=‘test3’, generationTime=2018-05-09T13:18:58.741Z, payload={dataSource=‘test3’, topic=‘test3’, partitions=1, replicas=1, durationSeconds=600, active=[{id=‘index_kafka_test3_04c8dd52c83a25b_hbicpkfi’, startTime=null, remainingSeconds=null}], publishing=[{id=‘index_kafka_test3_04c8dd52c83a25b_aefkalhc’, startTime=2018-05-09T12:39:30.055Z, remainingSeconds=32}]}}

2018-05-09 13:19:28,741 INFO i.d.i.k.s.KafkaSupervisor [KafkaSupervisor-test3] {id=‘test3’, generationTime=2018-05-09T13:19:28.741Z, payload={dataSource=‘test3’, topic=‘test3’, partitions=1, replicas=1, durationSeconds=600, active=[{id=‘index_kafka_test3_04c8dd52c83a25b_hbicpkfi’, startTime=null, remainingSeconds=null}], publishing=[{id=‘index_kafka_test3_04c8dd52c83a25b_aefkalhc’, startTime=2018-05-09T12:39:30.055Z, remainingSeconds=2}]}}

2018-05-09 13:19:58,739 ERROR i.d.i.k.s.KafkaSupervisor [KafkaSupervisor-test3] No task in [[index_kafka_test3_04c8dd52c83a25b_aefkalhc]] succeeded before the completion timeout elapsed [PT1800S]!: {class=io.druid.indexing.kafka.supervisor.KafkaSupervisor}

Regards,

Sohel

I ran into the same issue. Is there any solution for this yet?