Kafka Indexing Service + Kafka coordinator issue = no task in middle manager

Hi all,

My name is Dan and this is my first post here, so please bear with me. I wanted to share an issue that we encountered with the Kafka indexing service, in the hope that a future version can add more support for this particular case.

In short, we are using the Kafka indexing service to ingest data from a Kafka topic into Druid. The problem was that no tasks were being spawned by the middle manager, and the logs didn't show anything unusual:
```
2017-03-15T10:14:38,778 INFO [KafkaSupervisor-xxx] io.druid.indexing.overlord.RemoteTaskRunner - Registered listener [KafkaSupervisor-xxx]
2017-03-15T10:14:39,035 INFO [KafkaSupervisor-xxx] io.druid.indexing.kafka.supervisor.KafkaSupervisor - New partition [0] discovered for topic [yyy], added to task group [0]
2017-03-15T10:14:39,036 INFO [KafkaSupervisor-xx] io.druid.indexing.kafka.supervisor.KafkaSupervisor - Creating new task group [0] for partitions [0]
2017-03-15T10:15:27,043 INFO [TaskQueue-StorageSync] io.druid.indexing.overlord.TaskQueue - Synced 0 tasks from storage (0 tasks added, 0 tasks removed).
```

After debugging, we pinpointed the issue on the Kafka side: we previously had some test topics created on brokers that were no longer available (the AWS instances had been shut down).
We are on version 0.9.2, and the process gets stuck in KafkaSupervisor.getOffsetFromKafkaForPartition(), line 1380:

```java
return consumer.position(topicPartition);
```

Going through the Kafka code, in AbstractCoordinator.ensureCoordinatorKnown(), I can see a swallowed exception - "The group coordinator is not available." - and the call apparently loops forever.
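For anyone hitting the same symptom before a proper fix lands: one way to keep a supervisor thread from blocking indefinitely on a call like consumer.position() is to run it with a bounded wait. The sketch below is not the Druid code and not a patch for it, just an illustration of the pattern; guardedPosition is a hypothetical helper, and the Callable stands in for the real KafkaConsumer call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class GuardedCall {
    // Hypothetical guard around a potentially-blocking offset fetch such as
    // consumer.position(topicPartition). If the group coordinator never becomes
    // available, the caller gets a TimeoutException instead of hanging forever.
    static long guardedPosition(Callable<Long> fetchOffset, long timeoutMs)
            throws Exception {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            Future<Long> future = exec.submit(fetchOffset);
            // Bounded wait: throws TimeoutException when the call does not return in time
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            // Interrupt the hung call rather than leaking the worker thread
            exec.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A call that returns promptly behaves as before
        long offset = guardedPosition(() -> 42L, 1000L);
        System.out.println("offset=" + offset); // prints "offset=42"

        // A simulated hang (coordinator never found) is cut off by the timeout
        try {
            guardedPosition(() -> { Thread.sleep(60_000); return 0L; }, 200L);
        } catch (TimeoutException e) {
            System.out.println("timed out instead of hanging");
        }
    }
}
```

A timeout like this would at least turn the silent hang into a loggable error, which is the real ask here.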

After cleaning up our ZooKeeper, everything went back to normal, but it would be helpful if we got more information in the logs.

Thank you!
Dan

Hi Dan, thanks for reporting it. I agree the logs could be more informative.
Could you please file a GitHub issue for this?

We would be happy to accept a PR for improving this.