Druid indexing task failing with auxService:mapreduce_shuffle error

Hi,
my environment:

running Hadoop 2.7.3 in pseudo-distributed mode on a local Mac

Druid version, locally deployed: 0.9.2

indexing task: HDFS (Hadoop-based)

When I run the indexing task, the MR job fails with this error:

2017-03-21T04:58:38,424 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1490069884381_0002_m_000000_2, Status : FAILED
Container launch failed for container_1490069884381_0002_01_000004 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
    at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
    at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155)
    at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:375)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I am including the output of "hadoop classpath" in the classpath when running the overlord and middle manager. Also, my yarn-site.xml has:

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
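For what it's worth, InvalidAuxServiceException means the NodeManager that launched the container never registered the shuffle handler — typically because it read a different yarn-site.xml than the one edited, or was not restarted after the edit. A small sanity-check sketch (the function name is illustrative, not part of any Hadoop or Druid API) can confirm a given yarn-site.xml actually declares both properties:

```python
import xml.etree.ElementTree as ET

def declares_mapreduce_shuffle(yarn_site_path):
    """Return True if this yarn-site.xml registers the mapreduce_shuffle aux service."""
    props = {}
    for prop in ET.parse(yarn_site_path).getroot().iter("property"):
        props[prop.findtext("name")] = prop.findtext("value")
    services = (props.get("yarn.nodemanager.aux-services") or "").split(",")
    handler = props.get("yarn.nodemanager.aux-services.mapreduce_shuffle.class")
    return ("mapreduce_shuffle" in [s.strip() for s in services]
            and handler == "org.apache.hadoop.mapred.ShuffleHandler")
```

Run it against the file the NodeManager actually loads (on this setup, presumably the one under /usr/local/Cellar/hadoop/2.7.3/libexec/etc/hadoop), and remember the NodeManager must be restarted for the setting to take effect.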

Command to run the services:

java `cat conf-quickstart/druid/historical/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/historical:lib/*:/usr/local/Cellar/hadoop/2.7.3/libexec/etc/hadoop:/usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/common/lib/*:/usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/common/*:/usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/hdfs:/usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/hdfs/lib/*:/usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/hdfs/*:/usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/yarn/lib/*:/usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/yarn/*:/usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/mapreduce/lib/*:/usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/mapreduce/*:/Users/abcd/documents/poc/druid/druid-0.9.2/extensions" io.druid.cli.Main server [overlord|middle-manager]

Any help?

Weird… it seems that your YARN configuration file isn't included in Druid's classpath, even though your command appears to include it. Would you double-check, please?
Also, you can simply test by copying your Hadoop configuration files (core-site, yarn-site, etc.) to Druid's common configuration directory.
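That copy step can be scripted; a minimal sketch, assuming the standard four client config files (the function name and file list are illustrative — adjust the paths to your install):

```python
import shutil
from pathlib import Path

# The client-side Hadoop configs Druid needs to see on its classpath.
HADOOP_CONFS = ["core-site.xml", "hdfs-site.xml", "yarn-site.xml", "mapred-site.xml"]

def copy_hadoop_confs(hadoop_conf_dir, druid_common_dir):
    """Copy each Hadoop config file that exists into Druid's _common directory."""
    copied = []
    for name in HADOOP_CONFS:
        src = Path(hadoop_conf_dir) / name
        if src.exists():  # mapred-site.xml may be absent on a fresh pseudo-distributed setup
            shutil.copy(src, Path(druid_common_dir) / name)
            copied.append(name)
    return copied
```

e.g. copy_hadoop_confs("/usr/local/Cellar/hadoop/2.7.3/libexec/etc/hadoop", "conf-quickstart/druid/_common")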

Jihoon

On Tue, Mar 21, 2017 at 2:05 PM, Yogesh Agrawal <yogeshkagrawal@gmail.com> wrote:

Hi Jihoon,
Thanks for your reply. My Hadoop configuration files are already copied to the conf/druid/_common folder. I also played around with {"mapreduce.job.user.classpath.first": true}, but I get the same error. Any pointers would be a great help.

Thanks,

-Y

One thing I noticed: under "druid/druid-0.9.2/extensions/druid-hdfs-storage", the bundled Hadoop jars are version 2.3, but I am running Hadoop 2.7.3. That mismatch may be causing the issue. I will get jars matching my Hadoop version and try again.
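A quick way to spot that kind of mismatch is to list the Hadoop jar versions a directory bundles; a sketch (the helper name and version regex are my assumptions, not a Druid utility):

```python
import os
import re

def hadoop_jar_versions(extension_dir):
    """Map each bundled hadoop-*.jar name to the version embedded in its file name."""
    versions = {}
    for name in sorted(os.listdir(extension_dir)):
        m = re.match(r"(hadoop-[a-z-]+)-(\d+(?:\.\d+)+)\.jar$", name)
        if m:
            versions[m.group(1)] = m.group(2)
    return versions
```

Running it over both extensions/druid-hdfs-storage and the live Hadoop install's share/hadoop directories makes any version skew obvious at a glance.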

I installed Hadoop 2.6.0 alongside Druid 0.9.2, and it worked with these configurations:

  1. "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:2.6.0"]

  2. "jobProperties": {
         "mapreduce.job.classloader": true,
         "mapreduce.map.java.opts": "-Djava.net.preferIPv4Stack=true -Xmx3865051136 -Duser.timezone=UTC -Dfile.encoding=UTF-8",
         "mapreduce.reduce.java.opts": "-Djava.net.preferIPv4Stack=true -Xmx3865051136 -Duser.timezone=UTC -Dfile.encoding=UTF-8",
         "mapreduce.job.classloader.system.classes": "-javax.validation.,java.,javax.,org.apache.commons.logging.,org.apache.log4j.,org.apache.hadoop.,org.w3c.,org.xml."
     }

  3. druid.extensions.loadList=["druid-hdfs-storage"]

  4. druid.extensions.hadoopDependenciesDir=/Users/druidUser/druid-0.9.2/hadoop-dependencies

  5. copied all Hadoop XMLs to the druid/_common path

  6. provided the path to the extensions directory in the classpath
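For reference, the first two settings above live in the indexing task spec itself, roughly like this (a sketch — the dataSchema/ioConfig contents are placeholders, not from the original task):

```json
{
  "type": "index_hadoop",
  "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:2.6.0"],
  "spec": {
    "dataSchema": { "...": "as in the existing task" },
    "ioConfig": { "type": "hadoop", "...": "input paths etc." },
    "tuningConfig": {
      "type": "hadoop",
      "jobProperties": {
        "mapreduce.job.classloader": "true",
        "mapreduce.job.classloader.system.classes": "-javax.validation.,java.,javax.,org.apache.commons.logging.,org.apache.log4j.,org.apache.hadoop.,org.w3c.,org.xml."
      }
    }
  }
}
```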

With the same configuration it still does not work on Hadoop 2.7.3. For now I am fine with 2.6.0, but if anyone has inputs, please share them.

Thanks,