A question about intermediatePersistPeriod in the Kafka indexing service, and a few other questions

  • In Kafka-based indexing, does this property (defaults to 10m) imply a local persist of the segments and the offsets every 10 minutes? And if that node were restarted, would the ACTIVE phase take the most recent persist into consideration?

  • In real-time querying, do the locally persisted segments on peons (if I am understanding the presence of these properties correctly) have a memory-based replica, or are they pulled into memory on request?

  • Apart from lowering the timeDuration, are there any pointers for making queries on the peons faster (apart from reducing partitions, which in our case is not an option)?

Any other pointers for making real-time querying leaner and faster will be appreciated.
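For reference, here is a rough sketch of where the settings discussed above sit in a Kafka supervisor spec. The datasource, topic, and broker address are placeholders, the `dataSchema` is deliberately incomplete, and we are assuming that the "timeDuration" mentioned above corresponds to the `taskDuration` field in `ioConfig`.

```python
import json

# Sketch of a Kafka supervisor spec fragment. All names and addresses below
# are placeholders, not values from our cluster.
supervisor_spec = {
    "type": "kafka",
    # Incomplete on purpose: a real spec also needs parser, metrics, granularity.
    "dataSchema": {"dataSource": "example_datasource"},
    "tuningConfig": {
        "type": "kafka",
        # ISO-8601 period: 10 minutes, the default mentioned above.
        "intermediatePersistPeriod": "PT10M",
    },
    "ioConfig": {
        "topic": "example_topic",
        "consumerProperties": {"bootstrap.servers": "localhost:9092"},
        # Assumption: this is the setting referred to as "timeDuration" above.
        "taskDuration": "PT1H",
    },
}

print(json.dumps(supervisor_spec, indent=2))
```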

Is there any feedback? We ran into the issue described below when we resubmitted a job (we removed a single dimension).

Note that our strategy is equalDistribution.

When we resubmit, the ACTIVE tasks immediately switch to PUBLISH and new ACTIVE tasks are introduced almost immediately. Because of equalDistribution, we see the new ACTIVE tasks not picking up from the lastOffset of the PUBLISH tasks. We presume that the offsets are persisted to the remote lookup at the end of the PUBLISH task, which makes sense, as that marks a subset of the Kafka stream as complete from start to finish. Tasks persist their offsets to the local FS every 10 minutes, but without affinity the new ACTIVE tasks cannot retrieve the locally persisted last offset of the previous ACTIVE task and thus use the stale offsets in the remote lookup.

This leads to an exception when the new ACTIVE tasks (the ones that got redistributed to a new node, losing affinity) are ready to PUBLISH:

io.druid.java.util.common.ISE: Transaction failure publishing segments, aborting

Note that Druid recovers by restarting the job using the remote offsets.
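To make our reading of the failure concrete, here is a toy model of it. It assumes that publishing behaves like a compare-and-swap on the remotely stored offsets, i.e. a publish is accepted only if the task's start offsets equal what the store currently holds. The class name and offset values are made up for illustration and this is not Druid's actual code.

```python
class MetadataStore:
    """Toy stand-in for the remote offset store (not Druid's implementation)."""

    def __init__(self, committed):
        self.committed = dict(committed)  # partition -> next offset to read

    def publish(self, start, end):
        # Compare-and-swap: accept only if the task started where the store ends.
        if start != self.committed:
            raise RuntimeError("Transaction failure publishing segments, aborting")
        self.committed = dict(end)


store = MetadataStore({0: 100})

# Old ACTIVE task started at 100 and had read up to 150 when we resubmitted.
old_start, old_end = {0: 100}, {0: 150}

# New ACTIVE task starts before the old one has published, so it sees the stale 100.
new_start = dict(store.committed)
new_end = {0: 180}

store.publish(old_start, old_end)  # old task finishes PUBLISH; store moves to 150

try:
    store.publish(new_start, new_end)  # start (100) no longer matches the store (150)
    outcome = "published"
except RuntimeError as exc:
    outcome = str(exc)

print(outcome)
```

Under that assumption, the new task's publish is rejected because it started from offsets that the earlier PUBLISH has since moved past, which matches the exception we see.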

Is this what we should expect, and if so, does “fillCapacityWithAffinity” potentially solve the issue?
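For context, this is roughly the worker configuration we would try. It is a sketch based on our understanding that the select strategy is set through the Overlord's dynamic worker configuration and that the affinity map goes from datasource name to a list of worker addresses. The datasource and worker address below are placeholders, and the exact shape should be checked against the documentation for the Druid version in use.

```python
import json

# Sketch of a worker select strategy payload. The datasource name and the
# worker host:port are placeholders, not values from our cluster.
worker_config = {
    "selectStrategy": {
        "type": "fillCapacityWithAffinity",
        "affinityConfig": {
            # datasource -> workers that should preferentially run its tasks
            "affinity": {"example_datasource": ["worker-host-1:8091"]},
        },
    }
}

payload = json.dumps(worker_config, indent=2)
print(payload)
```

Note that, as we understand it, the affinity is per datasource rather than per task, so with more than one worker listed a replacement task may still land on a different node than the one holding the locally persisted data.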