What to set KafkaIndexingService taskDuration to?

Hey all,

With the Kafka Indexing Service now supporting incremental handoffs, how are folks configuring its taskDuration?

If the service were ingesting hourly data into hourly segments, would there be any downside to setting taskDuration to one or more days? Or even just having the task run forever?

The theorised advantage is that when the task runs forever, the service doesn’t abruptly write smaller segments.

Best regards,

Dylan

I try to set taskDuration to roughly 3-4x the segment granularity (assuming the segment granularity is less than a day or so), mostly to allow rotation of the task logs. Currently, task logs are stored locally until the task finishes, so their size should be kept reasonable.

So ideally, once task logs can also be uploaded incrementally to some long-term store, tasks could run forever and the supervisor would just create new tasks when existing ones fail.
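
To make the 3-4x rule of thumb concrete, here is a minimal sketch in Python. It assumes taskDuration is given as an ISO-8601 period in the supervisor spec's ioConfig; the helper name `task_duration_period` and the topic name are made up for illustration, and the fragment is not a complete spec.

```python
import json


def task_duration_period(segment_hours: int, multiplier: int) -> str:
    """Return an ISO-8601 period equal to `multiplier` times the segment granularity.

    Hypothetical helper for illustration; it only handles whole hours.
    """
    if segment_hours <= 0 or multiplier <= 0:
        raise ValueError("segment_hours and multiplier must be positive")
    return f"PT{segment_hours * multiplier}H"


# Hourly segments with the 4x rule of thumb.
io_config_fragment = {
    "topic": "example-topic",  # hypothetical topic name
    "taskDuration": task_duration_period(segment_hours=1, multiplier=4),
}

# Print the fragment so it can be merged by hand into a full supervisor spec.
print(json.dumps(io_config_fragment, indent=2))
```

With hourly segments this gives a taskDuration of PT4H, so each task's local log covers about four hours before the task finishes and the log can be rotated.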

  • Parag

This is just my opinion; I'm not sure what others are setting taskDuration to. Maybe others can pitch in as well.

  • Parag