I’m looking for a way to set a timeout on Druid batch ingestion tasks. The overlord has a setting, druid.indexer.runner.taskAssignmentTimeout, but it only fails a task that could not be started on the middle manager within the specified duration. I’m after something more general: automatically terminating a batch ingestion task once xx time has passed since it started running.
The reason I’m asking is that I ran into a situation where Druid kept trying to connect to the EMR cluster indefinitely until the task was manually killed. The task had started successfully on the middle manager, but the connection errors did not cause it to terminate. A general way to kill a task after xx time would help, since other kinds of errors could lead to the same situation.
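
To make the behaviour I’m after concrete, the closest workaround I can think of is an external watchdog that polls the overlord and shuts down anything that has been around too long. Below is a rough sketch only, not a tested solution. It assumes the overlord is reachable at http://localhost:8090, that the overlord task API exposes GET /druid/indexer/v1/runningTasks and POST /druid/indexer/v1/task/{taskId}/shutdown as I understand it, and that each returned task carries an id and an ISO 8601 createdTime. The function names and the 2-hour limit are placeholders of mine, and createdTime is only an approximation of when the task actually started running.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone

OVERLORD = "http://localhost:8090"  # assumption: overlord address
MAX_RUNTIME = timedelta(hours=2)    # placeholder for the "xx time" limit


def parse_time(value):
    """Parse an ISO 8601 timestamp such as 2023-01-01T09:00:00.000Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def overdue_task_ids(tasks, now, max_runtime=MAX_RUNTIME):
    """Return the ids of tasks whose createdTime is older than max_runtime."""
    overdue = []
    for task in tasks:
        # createdTime is used as a proxy for the start time (assumption).
        if now - parse_time(task["createdTime"]) > max_runtime:
            overdue.append(task["id"])
    return overdue


def fetch_running_tasks():
    """Ask the overlord for the tasks it currently reports as running."""
    url = f"{OVERLORD}/druid/indexer/v1/runningTasks"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def shutdown_task(task_id):
    """Request a shutdown of one task and return the HTTP status code."""
    quoted = urllib.parse.quote(task_id, safe="")
    url = f"{OVERLORD}/druid/indexer/v1/task/{quoted}/shutdown"
    req = urllib.request.Request(url, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status


def kill_overdue_tasks():
    """One watchdog pass: shut down every task that exceeded MAX_RUNTIME."""
    now = datetime.now(timezone.utc)
    for task_id in overdue_task_ids(fetch_running_tasks(), now):
        shutdown_task(task_id)
```

Running kill_overdue_tasks() from cron every few minutes would approximate a timeout, but it would hit every running task, not just batch ingestion, unless the list is filtered further. I would much prefer a built-in setting if one exists.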