[Segment reindexing on hadoop] Error: java.io.IOException: Output Stream closed

Hello,
I am trying to do segment reindexing on Hadoop.
We run a hybrid setup: **S3** for **deep storage**, plus dedicated hosting for the **Druid** 0.10 servers and for **Hadoop** (**HDP 2.6**).
We used to run the reindexing on EMR, but we gave that up because EMR was too expensive for us.

So we set up our own Hadoop cluster instead; unfortunately, we are now stuck with problems.

We set the following properties in core-site.xml (only the property names are shown here):

      <name>fs.s3.buffer.dir</name>
      <name>fs.s3.impl</name>
      <name>fs.s3n.impl</name>
      <name>fs.s3n.endpoint</name>
      <name>fs.s3.buckets.create.region</name>
      <name>fs.s3.awsAccessKeyId</name>
      <name>fs.s3.awsSecretAccessKey</name>
      <name>fs.s3n.awsAccessKeyId</name>
      <name>fs.s3n.awsSecretAccessKey</name>
      <name>fs.s3a.access.key</name>
      <name>fs.s3a.secret.key</name>
      <name>com.amazonaws.services.s3.enableV4</name>
      <name>fs.s3a.endpoint</name>
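
Each of these entries follows the usual Hadoop property layout in our core-site.xml, roughly like the sketch below; the values shown are just placeholders, not our real credentials or settings:

      <property>
        <name>fs.s3a.access.key</name>
        <value>AKIAEXAMPLEKEY</value>        <!-- placeholder -->
      </property>
      <property>
        <name>fs.s3a.secret.key</name>
        <value>exampleSecretKey</value>      <!-- placeholder -->
      </property>
      <property>
        <name>fs.s3n.impl</name>
        <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
      </property>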

and the reduce tasks fail with the following IOException:

      2017-08-08T19:04:15,534 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 53%
      2017-08-08T19:04:24,562 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 54%
      2017-08-08T19:04:35,596 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 55%
      2017-08-08T19:04:36,600 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1501709513319_0009_r_000002_1, Status : FAILED
      Error: java.io.IOException: Output Stream closed
          at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen(S3AOutputStream.java:83)
          at org.apache.hadoop.fs.s3a.S3AOutputStream.flush(S3AOutputStream.java:89)
          at java.io.FilterOutputStream.flush(FilterOutputStream.java:140)
          at java.io.DataOutputStream.flush(DataOutputStream.java:123)
          at io.druid.indexer.JobHelper$4.push(JobHelper.java:401)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:498)
          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
          at com.sun.proxy.$Proxy91.push(Unknown Source)
          at io.druid.indexer.JobHelper.serializeOutIndex(JobHelper.java:412)
          at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:734)
          at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:478)
          at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
          at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
          at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
          at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:422)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
          at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)

      Container killed by the ApplicationMaster.
      Container killed on request. Exit code is 143
      Container exited with a non-zero exit code 143

      2017-08-08T19:04:37,604 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 35%
      2017-08-08T19:04:47,755 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 36%
      2017-08-08T19:05:06,141 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 37%
      2017-08-08T19:05:18,360 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 38%
      2017-08-08T19:05:33,656 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 39%


What could be wrong?

Regards,
Jan


Have you been able to resolve this issue?

I am getting the same exception with Druid 0.11.0, writing to an S3 bucket in a v4 region via s3a.
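
For context, the relevant s3a settings on my side look roughly like the sketch below; the endpoint value is only a placeholder for the actual region-specific (v4-only) endpoint of the target bucket:

      <property>
        <name>fs.s3a.endpoint</name>
        <value>s3.eu-central-1.amazonaws.com</value>  <!-- placeholder: regional endpoint of the bucket -->
      </property>
      <property>
        <name>com.amazonaws.services.s3.enableV4</name>
        <value>true</value>                           <!-- same property as in Jan's list above -->
      </property>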

Any hint is appreciated!

Thanks

Christoph