Containers failing in ingestion job: Druid 0.9.2, Java 7

Error logs:

For more detailed output, check the application tracking page: http://oser402528.wal-mart.com:8088/cluster/app/application_1517538669309_143406 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e107_1517538669309_143406_03_000001
Exit code: 1
Stack trace: org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Launch container failed
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:109)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:89)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:392)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Shell output: main : command provided 1
main : run as user is styagi
main : requested yarn user is styagi
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /u/applic/data/hdfs11/hadoop/yarn/local/nmPrivate/application_1517538669309_143406/container_e107_1517538669309_143406_03_000001/container_e107_1517538669309_143406_03_000001.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
2018-02-15T05:42:03,467 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 0
2018-02-15T05:42:03,468 ERROR [task-runner-0-priority-0] io.druid.indexer.DetermineHashedPartitionsJob - Job failed: job_1517538669309_143406
2018-02-15T05:42:03,469 INFO [task-runner-0-priority-0] io.druid.indexer.JobHelper - Deleting path[var/druid/hadoop-tmp/wikiticker/2018-02-15T054101.926Z_1dfc63f7571c417cb3351c08073d8a5c]
2018-02-15T05:42:03,528 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_wikiticker_2018-02-15T05:41:01.927Z, type=index_hadoop, dataSource=wikiticker}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:204) ~[druid-indexing-service-0.9.2.1.jar:0.9.2.1]
	at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:175) ~[druid-indexing-service-0.9.2.1.jar:0.9.2.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.2.1.jar:0.9.2.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.2.1.jar:0.9.2.1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_71]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_71]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_71]
	at java.lang.Thread.run(Thread.java:745) [?:1.7.0_71]
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_71]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_71]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_71]
	at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_71]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.2.1.jar:0.9.2.1]
	... 7 more
Caused by: com.metamx.common.ISE: Job[class io.druid.indexer.DetermineHashedPartitionsJob] failed!
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:369) ~[druid-indexing-hadoop-0.9.2.1.jar:0.9.2.1]
	at io.druid.indexer.HadoopDruidDetermineConfigurationJob.run(HadoopDruidDetermineConfigurationJob.java:91) ~[druid-indexing-hadoop-0.9.2.1.jar:0.9.2.1]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopDetermineConfigInnerProcessing.runTask(HadoopIndexTask.java:291) ~[druid-indexing-service-0.9.2.1.jar:0.9.2.1]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_71]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_71]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_71]
	at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_71]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.2.1.jar:0.9.2.1]
	... 7 more
2018-02-15T05:42:03,538 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_wikiticker_2018-02-15T05:41:01.927Z] status changed to [FAILED].
2018-02-15T05:42:03,541 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_wikiticker_2018-02-15T05:41:01.927Z",
  "status" : "FAILED",
  "duration" : 55517
}
2018-02-15T05:42:03,555 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBased

Logs from container:

Application application_1517538669309_143307 failed 3 times due to AM Container for appattempt_1517538669309_143307_000003 exited with exitCode: 1
For more detailed output, check the application tracking page: http://oser402528.wal-mart.com:8088/cluster/app/application_1517538669309_143307 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e107_1517538669309_143307_03_000001
Exit code: 1
Stack trace: org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Launch container failed
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:109)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:89)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:392)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Shell output: main : command provided 1
main : run as user is styagi
main : requested yarn user is styagi
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /u/applic/data/hdfs14/hadoop/yarn/local/nmPrivate/application_1517538669309_143307/container_e107_1517538669309_143307_03_000001/container_e107_1517538669309_143307_03_000001.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.

Is it somehow related to jars (guava) not getting copied to the containers?
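
For what it's worth, the YARN diagnostics above only say the launch failed with exit code 1; the actual exception (e.g. a guava NoSuchMethodError, if it really is a jar conflict) should be in the container's own stderr. Assuming log aggregation is enabled on the cluster, it can be fetched with "yarn logs -applicationId application_1517538669309_143406" and checked for the root cause.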

I have already tried setting the properties in the middleManager as described here: http://druid.io/docs/0.9.2/operations/other-hadoop.html

mapreduce.job.classloader = true

mapreduce.job.user.classpath.first = true
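
For reference, that doc puts these into the jobProperties of the Hadoop tuningConfig in the ingestion spec itself, rather than only in middleManager runtime properties. A minimal sketch of how that section would look (surrounding fields omitted; note the values are strings, and if I read the doc right, mapreduce.job.classloader is the first thing to try):

"tuningConfig" : {
  "type" : "hadoop",
  "jobProperties" : {
    "mapreduce.job.classloader" : "true",
    "mapreduce.job.user.classpath.first" : "true"
  }
}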

Does anyone have any suggestions or a solution for this?

I have the same problem.
Do you know how to fix it?

Thanks.