Error in opening zip file during ingestion

I’m getting the error “error in opening zip file” during the partial_index_generic_merge phase of parallel indexing. I’m reading Parquet files from Azure. The first two stages run fine and some of the partial_index_generic_merge tasks succeed, but some always fail with this error. Does anyone know what’s going on? Thanks!

Logs

```
2022-06-14T20:45:26,572 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Exception while running task[AbstractTask{id='partial_index_generic_merge_metrics_maestro_v4_staging_lite_test_nmgmeein_2022-06-14T20:45:18.877Z', groupId='index_parallel_metrics_maestro_v4_staging_lite_test_foopbemj_2022-06-13T22:17:40.070Z', taskResource=TaskResource{availabilityGroup='partial_index_generic_merge_metrics_maestro_v4_staging_lite_test_nmgmeein_2022-06-14T20:45:18.877Z', requiredCapacity=1}, dataSource='metrics_maestro_v4_staging_lite_test', context={forceTimeChunkLock=true, useLineageBasedSegmentAllocation=true}}]
java.util.zip.ZipException: error in opening zip file
	at java.util.zip.ZipFile.open(Native Method) ~[?:1.8.0_275]
	at java.util.zip.ZipFile.<init>(ZipFile.java:225) ~[?:1.8.0_275]
	at java.util.zip.ZipFile.<init>(ZipFile.java:155) ~[?:1.8.0_275]
	at java.util.zip.ZipFile.<init>(ZipFile.java:169) ~[?:1.8.0_275]
	at org.apache.druid.utils.CompressionUtils.unzip(CompressionUtils.java:235) ~[druid-core-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.batch.parallel.HttpShuffleClient.fetchSegmentFile(HttpShuffleClient.java:92) ~[druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.batch.parallel.HttpShuffleClient.fetchSegmentFile(HttpShuffleClient.java:43) ~[druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.fetchSegmentFiles(PartialSegmentMergeTask.java:228) ~[druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.runTask(PartialSegmentMergeTask.java:171) ~[druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.batch.parallel.PartialGenericSegmentMergeTask.runTask(PartialGenericSegmentMergeTask.java:41) ~[druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:159) ~[druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:471) [druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:443) [druid-indexing-service-0.22.1.jar:0.22.1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_275]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_275]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_275]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
```

I wonder if you hit this issue?

Hi @Mark_Herrera! Thanks, it might be that same issue, but it doesn’t look like there is any resolution there either?

Apologies because I forgot to welcome you. Welcome @ritvik-statsig! We have a pretty collegial community here.

As to your follow-up: it doesn’t look resolved to me either. You might try adding your example to the issue to try to draw more attention to it.

The only other thing that came to mind was resubmitting the job, but you said that “some always fail with this error.” Hopefully someone else will chime in.
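If you can get hold of one of the intermediate files that a failing merge task fetched, it may be worth checking whether it is a valid zip at all. The stack trace shows the failure inside `CompressionUtils.unzip`, called from `HttpShuffleClient.fetchSegmentFile`, and Java's "error in opening zip file" usually means the file is empty, truncated, or otherwise not a readable zip. Below is a rough Python sketch of that check. The function name `check_zip` is made up for illustration, and where the fetched files live on your workers is something you would need to find out for your own setup.

```python
import zipfile
from pathlib import Path


def check_zip(path):
    """Return a short diagnosis string for the file at `path`.

    Hypothetical helper for illustration; it is not part of Druid.
    """
    p = Path(path)
    if not p.exists():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"
    # is_zipfile looks for the end-of-central-directory record, which a
    # file cut off part-way through a transfer will normally lack.
    if not zipfile.is_zipfile(p):
        return "not a zip (possibly truncated or corrupted)"
    with zipfile.ZipFile(p) as zf:
        # testzip() reads every entry and returns the first bad name, or None.
        bad = zf.testzip()
    return "ok" if bad is None else f"corrupt entry: {bad}"
```

Calling `check_zip("/path/to/fetched/file.zip")` on a suspect file (the path here is a placeholder) should tell you whether the problem is an incomplete download or something else, which would be useful detail to include on the issue.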

Thanks Mark! Excited to be a part of this community. I got a lot of help from here when I was setting up my Druid cluster, and I hope to give back by answering some questions as well. I will comment on that GitHub issue.