Can't get hadoop index task to work

Hi - I’m trying to submit a hadoop index task to Overlord and get it to run on a remote Hadoop cluster. Currently I’m doing this all locally in Docker containers as a PoC [1]. So all of the Druid nodes run in separate Docker containers, and Hadoop (HDFS and Yarn) also runs in a Docker container [2].

I can submit the task [3] to the Overlord, and it seems to successfully get the data from HDFS, index it and create a segment [4].

However there are two problems:

  1. the job is not run on the remote hadoop cluster; it uses LocalJobRunner

  2. the job is unable to write the segment to HDFS: a “Pathname … is not a valid DFS filename” exception gets thrown [4]

I must not be setting things up properly, but can’t find anything else to try. I would hugely appreciate any pointers in the right direction.

Thanks,

Zach

[1] https://github.com/Banno/druid-docker/tree/install-hadoop

[2] https://registry.hub.docker.com/u/sequenceiq/hadoop-docker/

[3] https://github.com/zcox/druid-pageviews/blob/master/task5-hadoop.json

[4] https://gist.githubusercontent.com/zcox/bc2861977509cb111683/raw/56b5f0cd2b0f8fcc812393886650b39b54fb33a1/gistfile1.txt

Hi Zach,

I am trying to understand your setup.

So when you say a remote Hadoop cluster, what does "remote" mean?

You might know that Druid expects to load the configuration for the Hadoop cluster from the classpath. Are you providing that?

Hi Zach,

I had a similar error, but after upgrading to 0.7.1 it went away. I can’t be sure, but the fix was probably in this PR [1].

[1] https://github.com/druid-io/druid/pull/1121

Hi Slim & Prajwal - I upgraded to Druid 0.7.1.1 and now my problem #2 is solved. The hadoop index job properly writes the segment to HDFS, and the historical node picks it up and serves it. Queries are answered correctly. You can see the index task output [1].

However, I still have problem #1, as you can see in [1]. The job still uses LocalJobRunner, and the job does not run on the “remote” Hadoop cluster. I think I do not have the Hadoop config files set up properly on the Overlord node. I’m including a mapred-site.xml on the Overlord [2] and it just has:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop:8032</value>
  </property>
</configuration>

That file gets placed in /opt/hadoop/etc/hadoop on the Overlord node, and that dir is included in the Overlord’s classpath [3].

Because fig uses Docker links between these containers, Docker writes a line like this in the Overlord’s /etc/hosts:

172.17.0.159 hadoop

That IP allows the Overlord node to communicate with the Hadoop node.

The hadoop image I’m using [4] ends up running YARN, not classic MapReduce (i.e. there is no JobTracker). The YARN resource manager listens on 8032; its configuration reports:

yarn.resourcemanager.address = fc80d6b33899:8032 (source: "programatically")

I am not a Hadoop expert, and just kind of guessed that that’s what I should use in the Overlord’s mapred-site.xml file. I’m guessing I just don’t have that file, or some other Hadoop conf XML file, containing the right settings. Any idea what I should try next?

Thanks,

Zach

[1] https://gist.githubusercontent.com/zcox/01e8abfe0922b49ea55a/raw/a40658e2a6f59f06dfa241b00d42d9a8ba3616b2/gistfile1.txt

[2] https://github.com/Banno/druid-docker/blob/install-hadoop/hadoop-base/mapred-site.xml

[3] https://github.com/Banno/druid-docker/blob/install-hadoop/hadoop-base/Dockerfile#L19

[4] https://registry.hub.docker.com/u/sequenceiq/hadoop-docker/

Well, with this commit [1], which adds more settings to mapred-site.xml and yarn-site.xml on the Overlord, the job now does get submitted to the remote Hadoop cluster. However, the job fails [2]. I’m not sure if it fails due to a Druid problem or a Hadoop problem; I’ll continue to debug and report progress.

[1] https://github.com/Banno/druid-docker/commit/05c808c8eefe2e500b7fd139fc3a64601b9148cc

[2] https://gist.githubusercontent.com/zcox/37245d41825633c9b8b4/raw/b651a1d00ec55aa5065182f5dd556726eecb892c/gistfile1.txt
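Roughly speaking, the settings that matter for getting the client to submit to YARN instead of using LocalJobRunner look like this (a sketch from memory, not the literal contents of the commit; "hadoop" is just the Docker link alias for my Hadoop container):

In mapred-site.xml:

<configuration>
  <property>
    <!-- without this, Hadoop 2.x clients default to the local runner -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

In yarn-site.xml:

<configuration>
  <property>
    <!-- where to reach the resource manager; the RM listens on port 8032 by default -->
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop</value>
  </property>
</configuration>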

Hi Zach,

The error “File file:/tmp/hadoop-yarn/staging/root/.staging/job_1428703369275_0003/job.xml does not exist” makes me think that something is not wired up right with the hadoop filesystems. That path should be on HDFS. What’s your core-site.xml fs.defaultFS set to? Can you try setting it to your HDFS NameNode if it’s not already?
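For concreteness, the property I mean lives in core-site.xml and looks roughly like this (hostname and port are placeholders for wherever your NameNode actually runs):

  <property>
      <!-- unqualified paths, like the MR staging directory, resolve against this filesystem -->
      <name>fs.defaultFS</name>
      <value>hdfs://your-namenode-host:9000</value>
  </property>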

Hi Gian - core-site.xml in this Hadoop Docker container [1] contains only this:

  <property>
      <name>fs.defaultFS</name>
      <value>hdfs://cbb067ae55a4:9000</value>
  </property>

cbb067ae55a4 is the Docker container’s hostname, and /etc/hosts maps it to the container’s IP.

bash-4.1# hostname
cbb067ae55a4
bash-4.1# cat /etc/hosts
172.17.0.193 cbb067ae55a4
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

I also tried upgrading from Hadoop 2.4.1 to 2.6.0 and get the same error [2].

[1] https://github.com/sequenceiq/hadoop-docker/blob/master/core-site.xml.template

[2] https://gist.githubusercontent.com/zcox/44e35eb369c26491fe8f/raw/f593122acdf468119a2737c4f3f894135508922c/gistfile1.txt

Is that core-site.xml (and all the other xmls) also on the classpath of your druid middleManager?

Hi Gian - sorry, I was confused earlier. Now I have this core-site.xml on the Overlord [1], which sets both fs.default.name and fs.defaultFS:

  <property>
      <name>fs.default.name</name>
      <value>hdfs://hadoop:9000</value>
  </property>
  <property>
      <name>fs.defaultFS</name>
      <value>hdfs://hadoop:9000</value>
  </property>

The job now fails in a different way [2]:

Diagnostics: java.net.UnknownHostException: hadoop

The Overlord container does have an entry in /etc/hosts for hadoop:

[ root@fcb4f04c9d05:/data ]$ cat /etc/hosts
172.17.0.7 fcb4f04c9d05
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.17.0.6 hadoop_1
172.17.0.2 postgres
172.17.0.2 postgres_1
172.17.0.3 druiddocker_zookeeper_1
172.17.0.6 hadoop
172.17.0.3 zookeeper
172.17.0.3 zookeeper_1
172.17.0.6 druiddocker_hadoop_1
172.17.0.2 druiddocker_postgres_1

However, no such entry exists in /etc/hosts in the Hadoop container itself:

bash-4.1# cat /etc/hosts
172.17.0.6 0eed1244556e
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

I’m guessing something related to hdfs://hadoop:9000 gets put into the job in the Overlord container, but when the job actually runs in the Hadoop container, the "hadoop" hostname is unresolvable there. I’ll try to figure out a better way for these hostnames to get resolved, and report back.

Thanks,

Zach

[1] https://github.com/Banno/druid-docker/blob/install-hadoop/hadoop-base/core-site.xml

[2] https://gist.githubusercontent.com/zcox/fca6a58e3736ed97d3c2/raw/294e7242cf2dff36fe73bd6f42ccd1821f9eb93e/gistfile1.txt

I changed some things to use resolvable host names (e.g. hadoop.dev.banno.com) [1] and now the job seems to get farther, but it still fails; see the Druid task output [2] and YARN syslog [3]. The YARN error boils down to:

2015-04-11 12:04:52,192 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.NoSuchMethodError: org.apache.hadoop.ipc.RPC.getProtocolProxy(Ljava/lang/Class;JLjava/net/InetSocketAddress;Lorg/apache/hadoop/security/UserGroupInformation;Lorg/apache/hadoop/conf/Configuration;Ljavax/net/SocketFactory;ILorg/apache/hadoop/io/retry/RetryPolicy;Ljava/util/concurrent/atomic/AtomicBoolean;)Lorg/apache/hadoop/ipc/ProtocolProxy;

I have no idea what that means; I’ll have to dig into it.

[1] https://github.com/Banno/druid-docker/commit/9429f42b9614cbdf8536e2e60f91541481ae1b04

[2] https://gist.githubusercontent.com/zcox/7b70f447a99bd7e9aded/raw/124b4d87e8d7aa67313ed6205380e89264327ff6/gistfile1.txt

[3] https://gist.githubusercontent.com/zcox/0d72c2dd45d23842b2a1/raw/618ff09f43ed7f2df87c97b93703516912f8897d/gistfile1.txt

When googling that error message, I got a few hints about issues with different hadoop versions on the classpath.

I switched to Hadoop 2.4.1 [1] [2] and now the yarn error message is slightly different, but pretty much the same:

2015-04-11 13:10:40,853 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.NoSuchMethodError: org.apache.hadoop.http.HttpConfig.isSecure()Z
	at org.apache.hadoop.yarn.conf.YarnConfiguration.<clinit>(YarnConfiguration.java:322)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1374)
2015-04-11 13:10:40,869 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1

If I examine the job’s classpath deps in hdfs, it does seem that both hadoop 2.4.1 and 2.3.0 jars were uploaded there. Maybe that’s the problem? I’m clearly specifying "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:2.4.1"] in the hadoop index task. Why would both 2.4.1 and 2.3.0 jars be uploaded there?

[1] https://github.com/Banno/druid-docker/commit/ef0e9096f7c6a222aa04d223c87d79eae4fd26ae

[2] https://github.com/zcox/druid-pageviews/commit/df794d3c3780e38b478d4bb118b2e72cbdd08aa6

[3] https://gist.githubusercontent.com/zcox/56c77860fb75bb844f80/raw/a3e550d95155d84fb7365a81a69dd9fb368f1f74/gistfile1.txt

I also put this in common.runtime.properties:

druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.4.1"]

However there are still both hadoop 2.4.1 and 2.3.0 jars in /druid-working/classpath in hdfs.

How do I get just the hadoop 2.4.1 jars uploaded to hdfs?

You might need to compile a version of Druid built against hadoop 2.4.1 instead of 2.3.0.

I had a hunch that the hadoop 2.3.0 jars were being included by the druid-hdfs-storage extension, so I removed that from druid.extensions.coordinates. Now the job seems to get farther but still fails. See task output from overlord [1] and job syslog output [2]. The exception in [2] is this:

2015-04-12 14:51:25,093 ERROR [IPC Server handler 2 on 35430] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1428864489761_0001_m_000000_0 - exited : com.metamx.common.RE: Failure on row[{"eventId":"e1", "timestamp":"2015-03-24T14:00:00Z", "userId":"u1", "url":"http://site.com/1"}]
	at io.druid.indexer.HadoopDruidIndexerMapper.map(HadoopDruidIndexerMapper.java:98)
	at io.druid.indexer.HadoopDruidIndexerMapper.map(HadoopDruidIndexerMapper.java:44)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.NullPointerException
	at io.druid.indexer.HadoopDruidIndexerConfig.getBucket(HadoopDruidIndexerConfig.java:350)
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorMapper.innerMap(IndexGeneratorJob.java:231)
	at io.druid.indexer.HadoopDruidIndexerMapper.map(HadoopDruidIndexerMapper.java:94)
	... 9 more

Not sure what the problem is, though; this exact same data set gets indexed just fine by a non-hadoop index task [3]. For reference, [4] is the hadoop index task I’m using, and [5] is the data set. Am I specifying something incorrectly in the hadoop index task spec JSON?

[1] https://gist.githubusercontent.com/zcox/b6eddaea76343baebad9/raw/a1cdfda0abaaa65d9a245df39fce7ed17a778946/gistfile1.txt

[2] https://gist.githubusercontent.com/zcox/0fc76242702fb803c0c1/raw/d3c64a1bde1f50e377bedbc4027e1de90b6652f6/gistfile1.txt

[3] https://github.com/zcox/druid-pageviews/blob/master/task5.json

[4] https://github.com/zcox/druid-pageviews/blob/master/task5-hadoop.json

[5] https://github.com/zcox/druid-pageviews/blob/master/events.json

Nothing looks wrong with your configs at first glance. This looks sort of like a time bucketing problem. Are all of your services running in the same time zone?

Also, do you have logs from one of the mappers that failed? You should be able to get them from the hadoop web ui. There might be something interesting in there.

Hi Gian,

I’m starting all Druid nodes with -Duser.timezone=UTC [1].

I can’t get to any mapper-specific logs via yarn - I can only seem to get the container’s syslog output [2].

The exception seems to occur at this line: [3]

final ShardSpec actualSpec = shardSpecLookups.get(timeBucket.get().getStart()).getShardSpec(rollupGran.truncate(inputRow.getTimestampFromEpoch()), inputRow);

Specifically, shardSpecLookups.get(timeBucket.get().getStart()) appears to return null. In the task logs [4] the index spec has this in tuningConfig:

      "shardSpecs" : {
        "2015-03-24T00:00:00.000Z" : [ {
          "actualSpec" : {
            "type" : "none"
          },
          "shardNum" : 0
        } ]
      },

Any ideas why that shardSpecLookups.get() call would return null?

Thanks,

Zach

[1] https://github.com/Banno/druid-docker/blob/install-hadoop/base/start-node.sh#L36
[2] https://gist.githubusercontent.com/zcox/0fc76242702fb803c0c1/raw/d3c64a1bde1f50e377bedbc4027e1de90b6652f6/gistfile1.txt

[3] https://github.com/druid-io/druid/blob/druid-0.7.1.1/indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfig.java#L350

[4] https://gist.githubusercontent.com/zcox/b6eddaea76343baebad9/raw/a1cdfda0abaaa65d9a245df39fce7ed17a778946/gistfile1.txt

I finally got the Hadoop batch index task to work. The problem was that the Hadoop Docker container I was using is based on CentOS, which appears to use the Eastern time zone by default. After forcing it to use UTC, the Druid mapreduce job ran successfully, created the segment, and put it in hdfs.

I also had to use Hadoop 2.3.0. If I use any other version of Hadoop, then the dependency jars for both that version and 2.3.0 are used in the mapreduce job, which causes problems. I think the druid-hdfs-storage extension is always pulling in 2.3.0. Will look into that next.

I just ran into this exact issue. The Hadoop job labels the segments put in the working directory with a timezone suffix, and later looks for the same segments without the timezone suffix. I was able to get the indexer to complete successfully by omitting the -Duser.timezone=UTC bit, but I would prefer to standardize on UTC time. Is there a way to force the Hadoop job to use UTC, or did you have to actually update the host OS on all the datanodes?

Taylor, you should be able to add -Duser.timezone=UTC to mapred.child.java.opts, which *should* make the child JVMs on Hadoop use the UTC timezone.
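For example, something like this in mapred-site.xml (I’m going from memory on the Hadoop 2.x property names; mapred.child.java.opts is the older single property that covers both):

  <property>
      <!-- add the timezone flag to whatever JVM opts you already pass to map tasks -->
      <name>mapreduce.map.java.opts</name>
      <value>-Duser.timezone=UTC</value>
  </property>
  <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Duser.timezone=UTC</value>
  </property>

I think you can also pass these via jobProperties in the hadoop task’s tuningConfig instead of editing the cluster config, if that’s easier.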

Or, is that what you said you removed to get things to work?

--Eric

Hi Taylor,

I had to use UTC everywhere: on every single Druid node using -Duser.timezone=UTC, and also on every Hadoop node. Since the Hadoop image I’m using is based on CentOS, I had to do that at the OS level, using:

rm /etc/localtime

ln -s /usr/share/zoneinfo/UTC /etc/localtime