Druid ingestion Incremental persist failed

Druid ingestion runs fine when maxNumConcurrentSubTasks is not specified, meaning it runs in single-threaded mode, but it fails with the error below when maxNumConcurrentSubTasks=2. Could someone please advise how to resolve this?

The Druid version used is 0.20.0.
All the components are dockerized and running on a local machine.
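
For context, the only thing I change between the working and failing runs is maxNumConcurrentSubTasks in the tuningConfig of my index_parallel spec. Roughly, that part of the spec looks like the sketch below (the partitionsSpec and other values here are just illustrative, not my full config):

"tuningConfig": {
  "type": "index_parallel",
  "maxNumConcurrentSubTasks": 2,
  "partitionsSpec": {
    "type": "dynamic"
  }
}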

2021-10-08T00:51:59,272 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Exception while running task[AbstractTask{id='single_phase_sub_task_master_order_event_ckonplme_2021-10-08T00:51:28.490Z', groupId='index_parallel_master_order_event_inchjjih_2021-10-08T00:45:27.591Z', taskResource=TaskResource{availabilityGroup='single_phase_sub_task_master_order_event_ckonplme_2021-10-08T00:51:28.490Z', requiredCapacity=1}, dataSource='master_order_event', context={forceTimeChunkLock=true}}]
org.apache.druid.java.util.common.ISE: Failed to shutdown executors during close()
    at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.closeNow(AppenderatorImpl.java:925) ~[druid-server-0.20.0.jar:0.20.0]
    at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:396) ~[druid-indexing-service-0.20.0.jar:0.20.0]
    at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:193) ~[druid-indexing-service-0.20.0.jar:0.20.0]
    at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:140) ~[druid-indexing-service-0.20.0.jar:0.20.0]
    at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:451) [druid-indexing-service-0.20.0.jar:0.20.0]
    at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:423) [druid-indexing-service-0.20.0.jar:0.20.0]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_265]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
2021-10-08T00:51:59,354 ERROR [[single_phase_sub_task_master_order_event_ckonplme_2021-10-08T00:51:28.490Z]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Incremental persist failed: {class=org.apache.druid.segment.realtime.appenderator.AppenderatorImpl, segment=master_order_event_2021-10-01T00:00:00.000Z_2021-10-02T00:00:00.000Z_2021-10-08T00:46:05.379Z_5, dataSource=master_order_event, count=0}
2021-10-08T00:51:59,354 INFO [task-runner-0-priority-0] org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "single_phase_sub_task_master_order_event_ckonplme_2021-10-08T00:51:28.490Z",
  "status" : "FAILED",
  "duration" : 15282,
  "errorMsg" : "org.apache.druid.java.util.common.ISE: Failed to shutdown executors during close()",
  "location" : {
    "host" : null,
    "port" : -1,
    "tlsPort" : -1
  }
}

I haven't seen the Failed to shutdown executors during close() error before, but could I suggest that you check that you have enough workers available, and that the Overlord is able to communicate on the necessary ports with all the workers that get started? You can see the ports for your workers in the Services tab.
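
On a dockerized setup, the worker capacity and peon ports are set on each Middle Manager; something along the lines of the sketch below (values are only an example, your real config may differ):

# middleManager/runtime.properties (example values only)
# How many tasks this Middle Manager can run at once; it needs to cover the number of concurrent sub tasks you expect it to host
druid.worker.capacity=4
# The Middle Manager's own port (default 8091)
druid.plaintextPort=8091
# Peon tasks listen on ports allocated starting here; these need to be reachable, i.e. exposed from the container
druid.indexer.runner.startPort=8100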

I also see Incremental persist failed here – are those Middle Manager workers able to write to Deep Storage OK?
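
If you are using local deep storage, it is worth confirming that the directory set in common.runtime.properties is mounted into every Middle Manager container and is writable by it; roughly like this (the path is just a placeholder):

# common.runtime.properties (placeholder path)
druid.storage.type=local
druid.storage.storageDirectory=/opt/shared/segments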

I see the segments folder populated, so I think the workers are able to write to deep storage OK (in this case it is just local storage). The parallel run also seems to work for smaller workloads, so ports may not be the issue. But it seems to fail whenever the second sub task launches. I also see these WARN messages:

single_phase_sub_task_master_order_event_iobknmei_2021-10-12T17:44:52.853Z.log:2021-10-12T17:46:09,428 WARN [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Ingestion was throttled for [5,364] millis because persists were pending.
single_phase_sub_task_master_order_event_iobknmei_2021-10-12T17:44:52.853Z.log:2021-10-12T17:47:02,596 WARN [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Ingestion was throttled for [3,649] millis because persists were pending.
single_phase_sub_task_master_order_event_iobknmei_2021-10-12T17:44:52.853Z.log:2021-10-12T17:47:29,388 WARN [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Ingestion was throttled for [3,182] millis because persists were pending.
single_phase_sub_task_master_order_event_iobknmei_2021-10-12T17:44:52.853Z.log:2021-10-12T17:47:43,689 WARN [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Ingestion was throttled for [1,731] millis because persists were pending.
single_phase_sub_task_master_order_event_iobknmei_2021-10-12T17:44:52.853Z.log:2021-10-12T17:48:12,184 WARN [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Ingestion was throttled for [1,586] millis because persists were pending.
single_phase_sub_task_master_order_event_iobknmei_2021-10-12T17:44:52.853Z.log:2021-10-12T17:49:18,432 WARN [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Ingestion was throttled for [10,514] millis because persists were pending.
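
From what I understand, those throttling warnings are tied to the persist settings in the tuningConfig; for reference, this is roughly the shape of those fields in an index_parallel tuningConfig (the values below are illustrative defaults, not necessarily what I'm running):

"tuningConfig": {
  "type": "index_parallel",
  "maxNumConcurrentSubTasks": 2,
  "maxRowsInMemory": 1000000,
  "maxPendingPersists": 0
}

My understanding is that ingestion gets throttled while persists are pending beyond what maxPendingPersists allows, so lowering maxRowsInMemory (to persist smaller chunks more often) or giving the task JVMs more memory are the usual knobs, though I'm not sure that's what is actually killing the second sub task.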