segmentDescriptorInfo does not exist - Hadoop HA Mode

I am getting the following stack trace when trying to batch index some data with:

```
curl -X 'POST' -H 'Content-Type:application/json' -d @myapp_V1_daily-test-01.json localhost:8090/druid/indexer/v1/task
```

My Hadoop cluster runs in HA (High Availability) mode and was configured using Apache Ambari. The same Druid configuration works against my single-node, non-HA Hadoop cluster.
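
For reference, a quick sanity check I find useful on the Druid hosts before blaming the indexing task itself: with HA, the nameservice URI is normally used without a port, and listing through it should succeed via the failover proxy provider (hadoopc is the nameservice name from my setup; substitute your own):

```
# List HDFS through the HA nameservice URI (no port); this exercises the
# ConfiguredFailoverProxyProvider settings rather than a single NameNode.
/usr/hdp/current/hadoop-client/bin/hadoop fs -ls hdfs://hadoopc/
```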

Stacktrace:

```
2016-03-08T16:39:48,965 ERROR [task-runner-0] io.druid.indexer.IndexGeneratorJob - [File /myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/segmentDescriptorInfo does not exist.] SegmentDescriptorInfo is not found usually when indexing process did not produce any segments meaning either there was no input data to process or all the input events were discarded due to some error

2016-03-08T16:39:48,968 ERROR [task-runner-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_myapp_V1_2016-03-08T16:38:37.949Z, type=index_hadoop, dataSource=myapp_V1}]

java.lang.RuntimeException: java.lang.reflect.InvocationTargetException

at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]

at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:138) ~[druid-indexing-service-0.8.3.jar:0.8.3]

at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:206) ~[druid-indexing-service-0.8.3.jar:0.8.3]

at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:285) [druid-indexing-service-0.8.3.jar:0.8.3]

at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:265) [druid-indexing-service-0.8.3.jar:0.8.3]

at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_71]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_71]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_71]

at java.lang.Thread.run(Thread.java:745) [?:1.8.0_71]

Caused by: java.lang.reflect.InvocationTargetException

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_71]

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_71]

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_71]

at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_71]

at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:135) ~[druid-indexing-service-0.8.3.jar:0.8.3]

... 7 more

Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File /myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/segmentDescriptorInfo does not exist.

at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]

at io.druid.indexer.IndexGeneratorJob.getPublishedSegments(IndexGeneratorJob.java:109) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]

at io.druid.indexer.HadoopDruidIndexerJob$1.run(HadoopDruidIndexerJob.java:89) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]

at io.druid.indexer.JobHelper.runJobs(JobHelper.java:321) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]

at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:96) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]

at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:259) ~[druid-indexing-service-0.8.3.jar:0.8.3]

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_71]

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_71]

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_71]

at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_71]

at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:135) ~[druid-indexing-service-0.8.3.jar:0.8.3]

... 7 more

Caused by: java.io.FileNotFoundException: File /myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/segmentDescriptorInfo does not exist.

at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:795) ~[hadoop-hdfs-2.7.1.jar:?]

at org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:106) ~[hadoop-hdfs-2.7.1.jar:?]

at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:853) ~[hadoop-hdfs-2.7.1.jar:?]

at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:849) ~[hadoop-hdfs-2.7.1.jar:?]

at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-2.7.1.2.3.4.0-3485.jar:?]

at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:860) ~[hadoop-hdfs-2.7.1.jar:?]

at io.druid.indexer.IndexGeneratorJob.getPublishedSegments(IndexGeneratorJob.java:97) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]

at io.druid.indexer.HadoopDruidIndexerJob$1.run(HadoopDruidIndexerJob.java:89) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]

at io.druid.indexer.JobHelper.runJobs(JobHelper.java:321) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]

at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:96) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]

at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:259) ~[druid-indexing-service-0.8.3.jar:0.8.3]

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_71]

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_71]

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_71]

at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_71]

at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:135) ~[druid-indexing-service-0.8.3.jar:0.8.3]

... 7 more

2016-03-08T16:39:48,986 INFO [task-runner-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_myapp_V1_2016-03-08T16:38:37.949Z",
  "status" : "FAILED",
  "duration" : 14540
}
```

Listing the task's working directory under druid.indexer.task.hadoopWorkingPath shows a _SUCCESS marker and 26 empty part files, but no segmentDescriptorInfo directory:

```
/usr/hdp/current/hadoop-client/bin/hadoop fs -ls hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/
Found 27 items
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/_SUCCESS
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00000
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00001
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00002
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00003
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00004
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00005
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00006
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00007
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00008
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00009
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00010
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00011
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00012
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00013
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00014
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00015
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00016
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00017
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00018
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00019
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00020
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00021
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00022
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00023
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00024
-rw-r--r-- 3 myapp myapp 0 2016-03-08 11:40 hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/myapp_V1/2016-03-08T163922.256Z/part-r-00025
```

Configuration:

```
druid.indexer.runner.javaOpts=-server -Xmx2g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.7.1","org.apache.hadoop:hadoop-hdfs:2.7.1","org.apache.hadoop:hadoop-common:2.7.1"]
druid.indexer.task.baseDir=/opt/myapp/druid-data/base
druid.indexer.task.baseTaskDir=/opt/myapp/druid-data/base/tasks
druid.indexer.task.hadoopWorkingPath=/myapp/druid-data/druid-indexing
```
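
If I understand Hadoop path resolution correctly, an unqualified hadoopWorkingPath like the one above is resolved against fs.defaultFS, so the effective working directory is in HDFS, which matches the listing earlier in this post:

```
# /myapp/druid-data/druid-indexing has no scheme, so it resolves against
# fs.defaultFS (hdfs://hadoopc:8020 in my setup):
/usr/hdp/current/hadoop-client/bin/hadoop fs -ls hdfs://hadoopc:8020/myapp/druid-data/druid-indexing/
```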

Druid Process Classpath:

```
ps -ef | grep druid
myapp 10012 10000 2 17:16 ? 00:00:19 /usr/bin/java -Xms64m -Xmx64m -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/opt/myapp/druid-tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Dhadoop.fs.defaultFS=hdfs://hadoopc:8020 -Dhadoop.dfs.nameservices=hadoopc -Dhadoop.dfs.ha.namenodes.hadoopc=nn1,nn2 -Dhadoop.dfs.namenode.rpc-address.hadoopc.nn1=hadoopm01.myapp.example.com:8020 -Dhadoop.dfs.namenode.rpc-address.hadoopc.nn2=hadoopm02.myapp.example.com:8020 -Dhadoop.dfs.client.failover.proxy.provider.hadoopc=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider -classpath config/_common:config/middlemanager:lib/::/usr/hdp/current/hadoop-client/:/usr/hdp/current/hadoop-client/lib/*:/opt/myapp/druid-classpath io.druid.cli.Main server middleManager

myapp 10015 9999 3 17:16 ? 00:00:24 /usr/bin/java -Xms2g -Xmx2g -XX:NewSize=256m -XX:MaxNewSize=256m -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/opt/myapp/druid-tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Dhadoop.fs.defaultFS=hdfs://hadoopc:8020 -Dhadoop.dfs.nameservices=hadoopc -Dhadoop.dfs.ha.namenodes.hadoopc=nn1,nn2 -Dhadoop.dfs.namenode.rpc-address.hadoopc.nn1=hadoopm01.myapp.example.com:8020 -Dhadoop.dfs.namenode.rpc-address.hadoopc.nn2=hadoopm02.myapp.example.com:8020 -Dhadoop.dfs.client.failover.proxy.provider.hadoopc=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider -classpath config/_common:config/overlord:lib/::/usr/hdp/current/hadoop-client/:/usr/hdp/current/hadoop-client/lib/*:/opt/myapp/druid-classpath io.druid.cli.Main server overlord

myapp 10017 10001 2 17:16 ? 00:00:21 /usr/bin/java -Xms2g -Xmx2g -XX:NewSize=512m -XX:MaxNewSize=512m -XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/opt/myapp/druid-tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Dhadoop.fs.defaultFS=hdfs://hadoopc:8020 -Dhadoop.dfs.nameservices=hadoopc -Dhadoop.dfs.ha.namenodes.hadoopc=nn1,nn2 -Dhadoop.dfs.namenode.rpc-address.hadoopc.nn1=hadoopm01.myapp.example.com:8020 -Dhadoop.dfs.namenode.rpc-address.hadoopc.nn2=hadoopm02.myapp.example.com:8020 -Dhadoop.dfs.client.failover.proxy.provider.hadoopc=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider -classpath config/_common:config/coordinator:lib/::/usr/hdp/current/hadoop-client/:/usr/hdp/current/hadoop-client/lib/*:/opt/myapp/druid-classpath io.druid.cli.Main server coordinator

myapp 10018 10002 2 17:16 ? 00:00:19 /usr/bin/java -Xms4g -Xmx4g -XX:NewSize=2g -XX:MaxNewSize=2g -XX:MaxDirectMemorySize=8g -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/opt/myapp/druid-tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Dhadoop.fs.defaultFS=hdfs://hadoopc:8020 -Dhadoop.dfs.nameservices=hadoopc -Dhadoop.dfs.ha.namenodes.hadoopc=nn1,nn2 -Dhadoop.dfs.namenode.rpc-address.hadoopc.nn1=hadoopm01.myapp.example.com:8020 -Dhadoop.dfs.namenode.rpc-address.hadoopc.nn2=hadoopm02.myapp.example.com:8020 -Dhadoop.dfs.client.failover.proxy.provider.hadoopc=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider -classpath config/_common:config/historical:lib/::/usr/hdp/current/hadoop-client/:/usr/hdp/current/hadoop-client/lib/*:/opt/myapp/druid-classpath io.druid.cli.Main server historical

myapp 10019 10003 2 17:16 ? 00:00:19 /usr/bin/java -Xms4g -Xmx4g -XX:NewSize=1g -XX:MaxNewSize=1g -XX:MaxDirectMemorySize=8g -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/opt/myapp/druid-tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Dhadoop.fs.defaultFS=hdfs://hadoopc:8020 -Dhadoop.dfs.nameservices=hadoopc -Dhadoop.dfs.ha.namenodes.hadoopc=nn1,nn2 -Dhadoop.dfs.namenode.rpc-address.hadoopc.nn1=hadoopm01.myapp.example.com:8020 -Dhadoop.dfs.namenode.rpc-address.hadoopc.nn2=hadoopm02.myapp.example.com:8020 -Dhadoop.dfs.client.failover.proxy.provider.hadoopc=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider -classpath config/_common:config/broker:lib/::/usr/hdp/current/hadoop-client/:/usr/hdp/current/hadoop-client/lib/*:/opt/myapp/druid-classpath io.druid.cli.Main server broker

```
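
Side note: the HA settings above are injected as -Dhadoop.* system properties. An alternative I have seen suggested is to put the cluster's client XMLs on the Druid classpath so the HA configuration is read from hdfs-site.xml and core-site.xml directly; a sketch, assuming the standard HDP config location and the /opt/myapp/druid-classpath directory that is already on my classpath:

```
# Expose the Hadoop client configs to every Druid process via the existing
# classpath entry instead of duplicating them as -D flags.
ln -s /etc/hadoop/conf/core-site.xml /opt/myapp/druid-classpath/core-site.xml
ln -s /etc/hadoop/conf/hdfs-site.xml /opt/myapp/druid-classpath/hdfs-site.xml
```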

Update: I think I have resolved this. To get the following command to run, I had to place my rebuild/re-segment CSV files into HDFS first. Perhaps this is because I am using HDFS for deep storage?

```
curl -X 'POST' -H 'Content-Type:application/json' -d @myapp_V1_daily-test-01.json localhost:8090/druid/indexer/v1/task
```
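
For the record, a sketch of the upload step I ran first (local staging path taken from the ioConfig snippet below, HDFS destination from the listing that follows it):

```
# Stage the replay CSVs in HDFS so the Hadoop indexing job can read them.
/usr/hdp/current/hadoop-client/bin/hadoop fs -mkdir -p /myapp/druid-replay/myapp_V1_full
/usr/hdp/current/hadoop-client/bin/hadoop fs -put \
  /opt/myapp/druid-replay/tmp/myapp_V1_full/rebuild-*.csv \
  /myapp/druid-replay/myapp_V1_full/
```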

Snippet from druid/config/_common/common.runtime.properties:

```
# Deep Storage
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://hadoopc:8020/smyapp/druid-hdfs-storage
```
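
A quick way to confirm that successful tasks are actually publishing to deep storage (path copied from the snippet above):

```
# Each successfully indexed segment should appear under the deep-storage
# root as a datasource/interval/version directory tree.
/usr/hdp/current/hadoop-client/bin/hadoop fs -ls -R hdfs://hadoopc:8020/smyapp/druid-hdfs-storage | head
```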

Snippet from myapp_V1_daily-test-01.json; it points to the location I had already uploaded my CSVs to:

```
"ioConfig": {
  "type": "hadoop",
  "inputSpec": {
    "type": "static",
    "paths": "/opt/myapp/druid-replay/tmp/myapp_V1_full",
    "filter": "rebuild-*.csv"
  }
},
```
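
My understanding is that an unqualified paths value is resolved against the job's fs.defaultFS, so fully qualifying the URIs removes any local-vs-HDFS ambiguity. A hedged variant of the same inputSpec (glob pattern assumed from the file names in the listing below; I believe the static inputSpec accepts a comma-separated list of Hadoop paths, including globs):

```
"ioConfig": {
  "type": "hadoop",
  "inputSpec": {
    "type": "static",
    "paths": "hdfs://hadoopc:8020/myapp/druid-replay/myapp_V1_full/rebuild-*.csv"
  }
},
```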

```
/usr/hdp/current/hadoop-client/bin/hadoop fs -ls /myapp/druid-replay/myapp_V1_full/
Found 9 items
-rw-r--r-- 3 myapp myapp 9422282 2016-03-09 14:40 /myapp/druid-replay/myapp_V1_full/rebuild-1.csv
-rw-r--r-- 3 myapp myapp 9827898 2016-03-09 14:40 /myapp/druid-replay/myapp_V1_full/rebuild-2.csv
-rw-r--r-- 3 myapp myapp 9833933 2016-03-09 14:40 /myapp/druid-replay/myapp_V1_full/rebuild-3.csv
-rw-r--r-- 3 myapp myapp 9745977 2016-03-09 14:40 /myapp/druid-replay/myapp_V1_full/rebuild-4.csv
-rw-r--r-- 3 myapp myapp 9804340 2016-03-09 14:40 /myapp/druid-replay/myapp_V1_full/rebuild-5.csv
-rw-r--r-- 3 myapp myapp 9804828 2016-03-09 14:40 /myapp/druid-replay/myapp_V1_full/rebuild-6.csv
-rw-r--r-- 3 myapp myapp 9907675 2016-03-09 14:40 /myapp/druid-replay/myapp_V1_full/rebuild-7.csv
-rw-r--r-- 3 myapp myapp 10962590 2016-03-09 14:40 /myapp/druid-replay/myapp_V1_full/rebuild-8.csv
-rw-r--r-- 3 myapp myapp 728554 2016-03-09 14:40 /myapp/druid-replay/myapp_V1_full/rebuild-9.csv
```

On another related note, does anyone know off the top of their head whether the following path values are HDFS or local paths? Can I set these manually? What is the best practice?

```
druid.indexer.task.baseDir=/opt/myapp/druid-data/base
druid.indexer.task.baseTaskDir=/opt/myapp/druid-data/base/tasks
druid.indexer.task.hadoopWorkingPath=/opt/myapp/druid-data/druid-indexing
```