A random problem indexing files from S3 - "Unable to rename" - version 0.12.3

Hi, I’m using Druid 0.12.3, and sometimes I get a random error when I index files from S3.
When I re-run the same indexing task, it works and the indexing logs show success.
My deep storage is also in S3.

Any idea what could cause this?

The exception I get is below (I removed private details about the bucket and the datasource):

io.druid.indexer.JobHelper - Attempting rename from [s3n://s3bucket/druid/segments/datasource/2018-11-15T08:00:00.000Z_2018-11-15T09:00:00.000Z/2018-11-15T10:51:02.301Z/0/index.zip.0] to [s3n://s3bucket/druid/segments/datasource/2018-11-15T08:00:00.000Z_2018-11-15T09:00:00.000Z/2018-11-15T10:51:02.301Z/0/index.zip]
org.apache.hadoop.mapred.LocalJobRunner - reduce task executor complete.
org.apache.hadoop.mapred.LocalJobRunner - job_local535571586_0002
java.lang.Exception: io.druid.java.util.common.IOE: Unable to rename [s3n://s3bucket/druid/segments/datasource/2018-11-15T08:00:00.000Z_2018-11-15T09:00:00.000Z/2018-11-15T10:51:02.301Z/0/index.zip.0] to [s3n://s3bucket/druid/segments/datasource/2018-11-15T08:00:00.000Z_2018-11-15T09:00:00.000Z/2018-11-15T10:51:02.301Z/0/index.zip]
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.7.3.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) [hadoop-mapreduce-client-common-2.7.3.jar:?]
Caused by: io.druid.java.util.common.IOE: Unable to rename [s3n://s3bucket/druid/segments/datasource/2018-11-15T08:00:00.000Z_2018-11-15T09:00:00.000Z/2018-11-15T10:51:02.301Z/0/index.zip.0] to [s3n://s3bucket/druid/segments/datasource/2018-11-15T08:00:00.000Z_2018-11-15T09:00:00.000Z/2018-11-15T10:51:02.301Z/0/index.zip]
	at io.druid.indexer.JobHelper.serializeOutIndex(JobHelper.java:447) ~[druid-indexing-hadoop-0.12.3.jar:0.12.3]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:750) ~[druid-indexing-hadoop-0.12.3.jar:0.12.3]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:500) ~[druid-indexing-hadoop-0.12.3.jar:0.12.3]
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) ~[hadoop-mapreduce-client-common-2.7.3.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_191]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_191]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_191]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_191]
	at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_191]

Hi Eran,

with S3, you might get transient errors while performing any S3 operation.

Druid is supposed to retry on those transient errors. Is your task failing because of this error?
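
For context, your log shows the pattern involved: the indexer writes the segment zip under a suffixed temporary name (index.zip.0 here) and then renames it to the final index.zip. The retry-on-transient-error idea looks roughly like the minimal Java sketch below. This is only an illustration of the general pattern, not Druid's actual code; RetrySketch, IoOperation, retryTransient, and the backoff parameters are all hypothetical.

import java.io.IOException;
import java.util.concurrent.ThreadLocalRandom;

// Minimal sketch: retry an I/O operation that may fail transiently,
// backing off exponentially (with jitter) between attempts.
// Illustrative only; not the code Druid itself uses.
public class RetrySketch
{
  @FunctionalInterface
  interface IoOperation<T>
  {
    T call() throws IOException;
  }

  static <T> T retryTransient(IoOperation<T> op, int maxTries) throws IOException
  {
    for (int attempt = 1; ; attempt++) {
      try {
        return op.call();
      }
      catch (IOException e) {
        if (attempt >= maxTries) {
          throw e; // out of attempts, surface the failure to the caller
        }
        // Back off before retrying: 200ms, 400ms, 800ms, ... plus jitter.
        long sleepMillis = (100L << attempt) + ThreadLocalRandom.current().nextLong(100);
        try {
          Thread.sleep(sleepMillis);
        }
        catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw new IOException("interrupted while waiting to retry", ie);
        }
      }
    }
  }

  public static void main(String[] args) throws IOException
  {
    // Hypothetical usage: the rename of index.zip.0 -> index.zip would be
    // the operation wrapped here (e.g. a Hadoop FileSystem rename call).
    boolean ok = retryTransient(() -> true, 5);
    System.out.println("renamed: " + ok);
  }
}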

Jihoon

I posted the exception I saw in the indexing logs. Is there any other log I can check for more info about those errors?

And to answer your question: yes, my task is failing because of that error.