Kafka Indexing Service

Hi all,
I’m working on a project with Apache Kafka and Druid, and I’m using the Kafka supervisor to load records from Kafka into Druid. Each task has a duration of 10 minutes.

If one or more Kafka indexing tasks fail for some reason (for example, an OutOfMemory error), how can I recover the failed tasks? It seems that all the records the failed task read from the Kafka topic but had not yet published are lost.

Is there a way to recover Kafka messages that were not indexed by Druid?


I think this depends entirely on how your topic and consumer are configured. Is the topic compacted? What is the retention time?
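To make the retention question concrete, here is a rough Python sketch of the check I have in mind. It only models simple time-based retention and ignores compaction. The function name and all the numbers are made up for illustration; nothing here comes from Druid or Kafka APIs.

```python
from datetime import datetime, timedelta, timezone


def still_in_kafka(record_time, retention, now):
    """Return True if a record written at record_time is still inside
    a purely time-based retention window (compaction is ignored)."""
    return now - record_time < retention


# Hypothetical scenario: a 10-minute task fails 8 minutes after it started,
# and the topic is assumed to keep data for 7 days.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
task_start = now - timedelta(minutes=8)
retention = timedelta(days=7)

print(still_in_kafka(task_start, retention, now))
```

If this comes out true for the oldest record the failed task had read, the messages should still be in the topic for a later task to read again. With a very short retention time, or on a compacted topic, that may not hold.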

I am not a Kafka expert, but setting


in your consumer config (ioConfig in Druid) might work. Again, this is dependent on your setup…