Kafka Indexing Service: consuming from specific offsets

We are using Kafka Indexing Service to ingest data.

Now we want to transfer the data from one Druid cluster to another Druid cluster (with a different metadata db and deep storage).

The steps are listed here:

  1. Shut down the Kafka supervisor

  2. Copy the segments to the new Druid cluster

  3. Update the metadata using the InsertSegment tool

  4. Start the Kafka supervisor on the new Druid cluster

The problem is that the two Druid clusters use the same Kafka cluster. How can I find out which offsets have been consumed, and how do I set the offsets in the new Kafka supervisor?

I believe this is saved in the metadata db, but I can only find the start offsets in the druid_tasks table, like this:

"startPartitions" : {
  "partitionOffsetMap" : {
    "11" : 3,
    "2" : 3,
    "5" : 3,
    "8" : 3
  },
  "topic" : "topic1"

Once the Kafka supervisor is started on the new cluster, pointing to the same metadata db, the new Kafka tasks will start consuming from the offsets that the last set of tasks ended on.
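For reference, as far as I know the committed offsets are tracked in the druid_dataSource table of the metadata db (the commit_metadata_payload column), not in druid_tasks. Below is a minimal sketch of reading such a payload. The payload shape is an assumption modeled on the startPartitions fragment quoted above and may differ between Druid versions; the function name committed_offsets and the sample string are hypothetical.

```python
import json

def committed_offsets(commit_metadata_payload):
    """Return (topic, {partition: offset}) from a Kafka datasource metadata payload.

    Assumes the payload is JSON with a "partitions" object that holds
    "topic" (or "stream") and "partitionOffsetMap".
    """
    meta = json.loads(commit_metadata_payload)
    partitions = meta["partitions"]
    topic = partitions.get("topic") or partitions.get("stream")
    # Partition ids are JSON object keys (strings); convert them to ints.
    offsets = {int(p): int(o) for p, o in partitions["partitionOffsetMap"].items()}
    return topic, offsets

# Hypothetical payload, shaped like the fragment quoted in this thread.
sample = (
    '{"type": "kafka", "partitions": {"topic": "topic1", '
    '"partitionOffsetMap": {"11": 3, "2": 3, "5": 3, "8": 3}}}'
)
topic, offsets = committed_offsets(sample)
print(topic, sorted(offsets.items()))
```

In practice the string would come from a query against the metadata db for the row of the datasource in question; that part is left out here.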

Thanks for your reply.

Unfortunately, our new Druid cluster uses a new metadata db, because the deep storage changed from NFS to HDFS.

I don't know if this method will work for me.

Anyway, my method is:

Start a KIS job on the new Druid cluster to consume the Kafka messages. After the job catches up to the latest Kafka offset, I begin transferring the segments from NFS to HDFS.

After all the data is ready in the new cluster, I stop the old KIS job and then the old Druid cluster.

It works fine.
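The "catches up to the latest Kafka offset" check can be sketched roughly as below. This is only an illustration: the function name lagging_partitions is hypothetical, the offsets are made up, and how the two maps are obtained (for example from the supervisor status and from Kafka itself) is not shown.

```python
def lagging_partitions(consumed, latest):
    """Return {partition: lag} for partitions whose consumed offset is behind the latest offset."""
    lag = {}
    for partition, end_offset in latest.items():
        # A partition missing from `consumed` is treated as not yet read at all.
        behind = end_offset - consumed.get(partition, 0)
        if behind > 0:
            lag[partition] = behind
    return lag

# Made-up offsets for illustration only.
latest = {2: 10, 5: 10, 8: 10, 11: 10}
consumed = {2: 10, 5: 7, 8: 10, 11: 10}
print(lagging_partitions(consumed, latest))  # {5: 3}
```

An empty result would mean the new job has caught up on every partition, which is the point at which I started the segment transfer.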

On Wednesday, December 20, 2017 at 11:31:22 PM UTC+8, Parag Jain wrote: