Kafka indexing service - questions

Hi,
We started using the Kafka indexing service because we were told it can partition our segments
and ensure our segments won’t be above X MB.
I have some questions about the configuration.

  1. Which property will ensure our segment size?
  2. How does the Kafka indexing service partition the segment? What is the recommended partition size?
  3. How can we stop the indexing-service task, modify it, and continue reading events from where we stopped?
  4. What is the meaning of the property taskDuration? If the duration is 1 hour, how can we ensure the segment will be the size specified in the property maxRowsPerSegment?

Thx,
Alon

Hi Alon,

Most of your questions can be answered directly from the Druid docs.

  1. Which property will ensure our segment size?

Refer to the KafkaSupervisorTuningConfig header for details - http://druid.io/docs/latest/development/extensions-core/kafka-ingestion

  2. How does the Kafka indexing service partition the segment? What is the recommended partition size?

The Kafka indexing service creates segment partitions based on the specified segment size and the number of partitions in the Kafka topic.

  3. How can we stop the indexing-service task, modify it, and continue reading events from where we stopped?
    Refer to the Operations header - http://druid.io/docs/latest/development/extensions-core/kafka-ingestion
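
As a rough sketch of what that section describes: the supervisor is managed through the Overlord's HTTP API, and re-submitting a modified spec to the same endpoint replaces the running supervisor, with the new tasks continuing from the offsets Druid has already stored. The Python below only builds the requests and does not send them. The Overlord address and the supervisor id are placeholders I made up, and the suspend/resume endpoints exist only in newer Druid releases, so please check the docs for the version you run.

```python
import json
import urllib.request

# Placeholder address (an assumption); point this at your own Overlord.
OVERLORD = "http://overlord.example.com:8090"


def supervisor_request(path, spec=None):
    """Build (but do not send) a POST request for the supervisor API."""
    body = json.dumps(spec).encode("utf-8") if spec is not None else None
    return urllib.request.Request(
        url=f"{OVERLORD}/druid/indexer/v1/supervisor{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def suspend(supervisor_id):
    # Stops the supervisor's tasks without deleting the supervisor.
    return supervisor_request(f"/{supervisor_id}/suspend")


def resume(supervisor_id):
    # Restarts ingestion for a suspended supervisor.
    return supervisor_request(f"/{supervisor_id}/resume")


def submit(spec):
    # Creates a supervisor, or updates the existing one for the same dataSource.
    return supervisor_request("", spec)
```

To actually send one of these, pass it to urllib.request.urlopen, for example urllib.request.urlopen(submit(modified_spec)), where modified_spec is your edited supervisor spec.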

  4. What is the meaning of the property taskDuration? If the duration is 1 hour, how can we ensure the segment will be the size specified in the property maxRowsPerSegment?

taskDuration is how long an ingestion task runs, irrespective of the number of segments it creates. Once that time has elapsed, the supervisor gracefully shuts the task down, the segments are published, and a new task is started.
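
To show where the two settings live, here is a minimal sketch of a supervisor spec fragment built as a Python dict and dumped to JSON. The values are illustrative only, and the dataSchema and the rest of ioConfig (topic, consumer properties) are left out. Note that maxRowsPerSegment is a row count, not a byte size, so a target of "X MB" has to be converted to rows; the helper below does that from an average row size that you would need to measure on your own segments (the 500 MB and 100 bytes figures are assumptions for the example).

```python
import json


def rows_for_target_size(target_mb, avg_row_bytes):
    """Approximate row count for a target segment size (hypothetical helper)."""
    return (target_mb * 1024 * 1024) // avg_row_bytes


def build_fragment(max_rows, task_duration):
    """Return the parts of a Kafka supervisor spec that hold the two settings."""
    return {
        "type": "kafka",
        "tuningConfig": {
            "type": "kafka",
            "maxRowsPerSegment": max_rows,  # caps rows per segment, not bytes
        },
        "ioConfig": {
            "taskDuration": task_duration,  # ISO 8601 period, e.g. one hour
        },
    }


# Assumed numbers: aim for roughly 500 MB segments at about 100 bytes per row.
fragment = build_fragment(rows_for_target_size(500, 100), "PT1H")
print(json.dumps(fragment, indent=2))
```

The two settings are independent: taskDuration controls when a task stops reading and hands off, while maxRowsPerSegment limits how many rows go into any one segment during that time.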