Druid 0.9.2 and HDP 2.5.3.0 (Hadoop 2.7.3) Indexing rm1,rm2 failover issue

Hello Druid friends!

I’m attempting to configure a Druid 0.9.2 cluster to interface with a Kerberos HDP 2.5.3.0 (Hadoop 2.7.3) cluster. 0.9.2 is especially appealing to me due to the added Kerberos support (among other great features)!

Druid by itself is running great. Interfacing with YARN on the HDP cluster, however, is giving me some trouble.

After submitting the wikiticker test via curl, when it comes time for Druid to submit indexing jobs to YARN, I get the following pair of messages 30 times:

org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1

I’ve done a bit of searching on here and this seems to be a common issue, usually relating to one of four things:

  1. Hadoop coordinates not being passed
    I’m currently passing: "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:2.7.3"]

  2. Hadoop dependencies are not the correct version
    I’m currently using the 2.7.3 Hadoop dependencies, which I obtained with pull-deps.
    I’ve tried providing 2.3.0, 2.7.1 and 2.7.3:
    bash-4.1$ ls -al /opt/druid/hadoop-dependencies/hadoop-client/
    total 20
    drwxr-xr-x. 5 root root 4096 Feb 3 12:04 .
    drwxr-xr-x. 3 root root 4096 Jan 24 13:44 ..
    drwxr-xr-x. 2 root root 4096 Feb 3 08:54 2.3.0
    drwxr-xr-x. 2 root root 4096 Feb 3 10:09 2.7.1
    drwxr-xr-x. 2 root root 4096 Feb 3 11:58 2.7.3

  3. Forcing Druid to run with mapreduce instead of YARN
    I’ve tried using
    "hadoop.mapreduce.job.classloader": "true"
    and, separately (not both at the same time, as suggested in the documentation at http://druid.io/docs/latest/operations/other-hadoop.html),
    "mapreduce.job.user.classpath.first": "true"

  4. core-site.xml, hdfs-site.xml, mapred-site.xml or yarn-site.xml is not in /_common/
    The node I’m running Druid on is registered with Ambari, and I am copying the *.xml files generated by Ambari from /usr/hdp/ to the _common directory:
    bash-4.1$ ls -al /etc/druid/_common/
    total 68
    drwxr-xr-x. 2 root root 4096 Feb 6 08:39 .
    drwxr-xr-x. 8 root root 4096 Feb 1 12:05 ..
    -rw-r--r--. 1 root root 1223 Feb 6 08:39 common.runtime.properties
    -rw-r--r--. 1 root root 8151 Feb 6 08:41 core-site.xml
    -rw-r--r--. 1 root root 10882 Feb 6 08:41 hdfs-site.xml
    -rw-r--r--. 1 root root 375 Feb 1 12:04 log4j2.xml
    -rw-r--r--. 1 root root 7880 Feb 6 08:42 mapred-site.xml
    -rw-r--r--. 1 root root 23617 Feb 6 08:41 yarn-site.xml

    /opt/druid/bin/node.sh is configured to use /etc/druid for all Druid services:
    CONF_DIR="${DRUID_CONF_DIR:=/etc/druid}"
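Since the failover messages name rm1 and rm2, one quick sanity check is to confirm that the yarn-site.xml copied into _common actually defines an address for each ResourceManager id. The following is a minimal sketch, not a diagnosis: the function names and the sample hosts are made up for illustration, and it only looks at the standard yarn.resourcemanager.ha.rm-ids, yarn.resourcemanager.address.<id> and yarn.resourcemanager.hostname.<id> properties.

```python
import xml.etree.ElementTree as ET


def parse_site_xml(xml_content):
    """Return a {name: value} dict from the content of a Hadoop *-site.xml file."""
    props = {}
    for prop in ET.fromstring(xml_content).iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def resource_manager_addresses(props):
    """Map each id in yarn.resourcemanager.ha.rm-ids to its address.

    Falls back to the per-id hostname property; the value is None when
    neither property is present for that id.
    """
    raw_ids = props.get("yarn.resourcemanager.ha.rm-ids", "")
    rm_ids = [rm_id.strip() for rm_id in raw_ids.split(",") if rm_id.strip()]
    return {
        rm_id: props.get("yarn.resourcemanager.address." + rm_id)
        or props.get("yarn.resourcemanager.hostname." + rm_id)
        for rm_id in rm_ids
    }


# Hypothetical sample; real values come from the Ambari-generated file.
SAMPLE = """<configuration>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.address.rm1</name><value>rm-host-1.example.com:8050</value></property>
</configuration>"""

for rm_id, address in resource_manager_addresses(parse_site_xml(SAMPLE)).items():
    print(rm_id, address or "NO ADDRESS CONFIGURED")
```

To run it against the real file, pass open("/etc/druid/_common/yarn-site.xml", "rb").read() to parse_site_xml. An id without an address would be worth investigating, but a clean result does not rule out other causes such as Kerberos or classpath problems.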

My struggles appear similar to this post in particular:
https://groups.google.com/forum/#!topic/druid-user/U9XjCiCXyDs

I know that Druid is picking up the 2.7.3 Hadoop dependencies, because I can see them loading:

added URL[file:/opt/druid/hadoop-dependencies/hadoop-client/2.7.3/hadoop-mapreduce-client-app-2.7.3.jar]

Here are the Druid io.druid.guice.JsonConfigurator container settings for the indexing containers:

Loaded class[class io.druid.guice.ExtensionsConfig] from props[druid.extensions.] as [ExtensionsConfig{searchCurrentClassloader=true, directory='/opt/druid/extensions', hadoopDependenciesDir='/opt/druid/hadoop-dependencies', hadoopContainerDruidClasspath='null', loadList=[druid-hdfs-storage, mysql-metadata-storage, druid-histogram, druid-stats]}]

When I submit an indexing job via curl, I get all the way to the rm1,rm2 failover point.
When I submit via CLI, I get the following (which seems to suggest an issue with the MySQL-metadata connector):
java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath /opt/druid/lib/*: io.druid.cli.Main index hadoop /opt/druid/quickstart/wikiticker-index-dan.json
2017-02-06T15:15:55,889 ERROR [main] io.druid.cli.CliHadoopIndexer - failure!!!
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_73]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_73]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_73]
at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_73]
at io.druid.cli.CliHadoopIndexer.run(CliHadoopIndexer.java:116) [druid-services-0.9.2.jar:0.9.2]
at io.druid.cli.Main.main(Main.java:106) [druid-services-0.9.2.jar:0.9.2]
Caused by: java.lang.ExceptionInInitializerError
at io.druid.cli.CliInternalHadoopIndexer.getHadoopDruidIndexerConfig(CliInternalHadoopIndexer.java:161) ~[druid-services-0.9.2.jar:0.9.2]
at io.druid.cli.CliInternalHadoopIndexer$1.configure(CliInternalHadoopIndexer.java:89) ~[druid-services-0.9.2.jar:0.9.2]
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340) ~[guice-4.1.0.jar:?]
at com.google.inject.spi.Elements.getElements(Elements.java:110) ~[guice-4.1.0.jar:?]
at com.google.inject.util.Modules$OverrideModule.configure(Modules.java:198) ~[guice-4.1.0.jar:?]
at com.google.inject.AbstractModule.configure(AbstractModule.java:62) ~[guice-4.1.0.jar:?]
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340) ~[guice-4.1.0.jar:?]
at com.google.inject.spi.Elements.getElements(Elements.java:110) ~[guice-4.1.0.jar:?]
at com.google.inject.util.Modules$OverrideModule.configure(Modules.java:177) ~[guice-4.1.0.jar:?]
at com.google.inject.AbstractModule.configure(AbstractModule.java:62) ~[guice-4.1.0.jar:?]
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340) ~[guice-4.1.0.jar:?]
at com.google.inject.spi.Elements.getElements(Elements.java:110) ~[guice-4.1.0.jar:?]
at com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138) ~[guice-4.1.0.jar:?]
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104) ~[guice-4.1.0.jar:?]
at com.google.inject.Guice.createInjector(Guice.java:99) ~[guice-4.1.0.jar:?]
at com.google.inject.Guice.createInjector(Guice.java:73) ~[guice-4.1.0.jar:?]
at com.google.inject.Guice.createInjector(Guice.java:62) ~[guice-4.1.0.jar:?]
at io.druid.initialization.Initialization.makeInjectorWithModules(Initialization.java:366) ~[druid-server-0.9.2.jar:0.9.2]
at io.druid.cli.GuiceRunnable.makeInjector(GuiceRunnable.java:62) ~[druid-services-0.9.2.jar:0.9.2]
at io.druid.cli.CliInternalHadoopIndexer.run(CliInternalHadoopIndexer.java:108) ~[druid-services-0.9.2.jar:0.9.2]
at io.druid.cli.Main.main(Main.java:106) ~[druid-services-0.9.2.jar:0.9.2]
... 6 more
Caused by: com.google.inject.CreationException: Unable to create injector, see the following errors:

  1. A binding to com.google.common.base.Supplier<io.druid.server.audit.SQLAuditManagerConfig> was already configured at io.druid.guice.JsonConfigProvider.bind(JsonConfigProvider.java:131) (via modules: com.google.inject.util.Modules$OverrideModule -> io.druid.metadata.storage.mysql.MySQLMetadataStorageModule).
    at io.druid.guice.JsonConfigProvider.bind(JsonConfigProvider.java:131) (via modules: com.google.inject.util.Modules$OverrideModule -> io.druid.metadata.storage.postgresql.PostgreSQLMetadataStorageModule)

  2. A binding to io.druid.server.audit.SQLAuditManagerConfig was already configured at io.druid.guice.JsonConfigProvider.bind(JsonConfigProvider.java:132) (via modules: com.google.inject.util.Modules$OverrideModule -> io.druid.metadata.storage.mysql.MySQLMetadataStorageModule).
    at io.druid.guice.JsonConfigProvider.bind(JsonConfigProvider.java:132) (via modules: com.google.inject.util.Modules$OverrideModule -> io.druid.metadata.storage.postgresql.PostgreSQLMetadataStorageModule)

2 errors
at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:470) ~[guice-4.1.0.jar:?]
at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:155) ~[guice-4.1.0.jar:?]
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:107) ~[guice-4.1.0.jar:?]
at com.google.inject.Guice.createInjector(Guice.java:99) ~[guice-4.1.0.jar:?]
at com.google.inject.Guice.createInjector(Guice.java:73) ~[guice-4.1.0.jar:?]
at com.google.inject.Guice.createInjector(Guice.java:62) ~[guice-4.1.0.jar:?]
at io.druid.initialization.Initialization.makeInjectorWithModules(Initialization.java:366) ~[druid-server-0.9.2.jar:0.9.2]
at io.druid.indexer.HadoopDruidIndexerConfig.<clinit>(HadoopDruidIndexerConfig.java:99) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
at io.druid.cli.CliInternalHadoopIndexer.getHadoopDruidIndexerConfig(CliInternalHadoopIndexer.java:161) ~[druid-services-0.9.2.jar:0.9.2]
at io.druid.cli.CliInternalHadoopIndexer$1.configure(CliInternalHadoopIndexer.java:89) ~[druid-services-0.9.2.jar:0.9.2]
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340) ~[guice-4.1.0.jar:?]
at com.google.inject.spi.Elements.getElements(Elements.java:110) ~[guice-4.1.0.jar:?]
at com.google.inject.util.Modules$OverrideModule.configure(Modules.java:198) ~[guice-4.1.0.jar:?]
at com.google.inject.AbstractModule.configure(AbstractModule.java:62) ~[guice-4.1.0.jar:?]
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340) ~[guice-4.1.0.jar:?]
at com.google.inject.spi.Elements.getElements(Elements.java:110) ~[guice-4.1.0.jar:?]
at com.google.inject.util.Modules$OverrideModule.configure(Modules.java:177) ~[guice-4.1.0.jar:?]
at com.google.inject.AbstractModule.configure(AbstractModule.java:62) ~[guice-4.1.0.jar:?]
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340) ~[guice-4.1.0.jar:?]
at com.google.inject.spi.Elements.getElements(Elements.java:110) ~[guice-4.1.0.jar:?]
at com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138) ~[guice-4.1.0.jar:?]
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104) ~[guice-4.1.0.jar:?]
at com.google.inject.Guice.createInjector(Guice.java:99) ~[guice-4.1.0.jar:?]
at com.google.inject.Guice.createInjector(Guice.java:73) ~[guice-4.1.0.jar:?]
at com.google.inject.Guice.createInjector(Guice.java:62) ~[guice-4.1.0.jar:?]
at io.druid.initialization.Initialization.makeInjectorWithModules(Initialization.java:366) ~[druid-server-0.9.2.jar:0.9.2]
at io.druid.cli.GuiceRunnable.makeInjector(GuiceRunnable.java:62) ~[druid-services-0.9.2.jar:0.9.2]
at io.druid.cli.CliInternalHadoopIndexer.run(CliInternalHadoopIndexer.java:108) ~[druid-services-0.9.2.jar:0.9.2]
at io.druid.cli.Main.main(Main.java:106) ~[druid-services-0.9.2.jar:0.9.2]
... 6 more
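
The useful part of a Guice CreationException like the one above tends to be buried in the "(via modules: ...)" fragments, which name the modules that registered the same binding. As a reading aid, here is a small sketch that pulls the innermost module names out of such a log. The function name is made up for illustration, and the sample text is abbreviated from the trace in this post.

```python
import re


def modules_in_guice_error(log_text):
    """Return the distinct innermost module names from '(via modules: A -> B)' fragments."""
    found = []
    for chain in re.findall(r"\(via modules: ([^)]+)\)", log_text):
        innermost = chain.split("->")[-1].strip()
        if innermost not in found:
            found.append(innermost)
    return found


# Abbreviated from error 1 in the trace above.
SAMPLE_ERROR = (
    "A binding to ... was already configured at "
    "io.druid.guice.JsonConfigProvider.bind(JsonConfigProvider.java:131) "
    "(via modules: com.google.inject.util.Modules$OverrideModule -> "
    "io.druid.metadata.storage.mysql.MySQLMetadataStorageModule).\n"
    "at io.druid.guice.JsonConfigProvider.bind(JsonConfigProvider.java:131) "
    "(via modules: com.google.inject.util.Modules$OverrideModule -> "
    "io.druid.metadata.storage.postgresql.PostgreSQLMetadataStorageModule)"
)

for module in modules_in_guice_error(SAMPLE_ERROR):
    print(module)
```

For this trace the list contains both the MySQL and the PostgreSQL metadata storage modules, which suggests that both were visible to the CLI run. The sketch only reports which modules are named; it does not explain how they ended up loaded together.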

I’m using mysql-metadata-storage-0.9.2.jar and mysql-connector-java-5.1.38.jar for my MySQL metadata storage extension.

Here is my common.runtime.properties (HOST, PRINCIPAL and KEYTAB are sanitized for public posting):
bash-4.1$ cat /etc/druid/_common/common.runtime.properties
database_name=druid
druid.discovery.curator.path=/druid/discovery
druid.extensions.directory=/opt/druid/extensions
druid.extensions.hadoopDependenciesDir=/opt/druid/hadoop-dependencies
#druid.extensions.hadoopContainerDruidClasspath=/usr/hdp/2.5.3.0-37/hadoop/conf
druid.extensions.loadList=["druid-hdfs-storage", "mysql-metadata-storage", "druid-histogram", "druid-stats"]
druid.extensions.pullList=
druid.host=HOST
druid.indexer.logs.directory=/druid/greenfield-stage/logs
druid.indexer.logs.type=hdfs
druid.metadata.storage.connector.connectURI=jdbc:mysql://HOST:3306/druid?createDatabaseIfNotExist=true
druid.metadata.storage.connector.password=druid
druid.metadata.storage.connector.port=3306
druid.metadata.storage.connector.user=druid
druid.metadata.storage.type=mysql
druid.selectors.coordinator.serviceName=druid/coordinator
druid.selectors.indexing.serviceName=druid/overlord
druid.storage.storageDirectory=/druid/greenfield-stage/data
druid.storage.type=hdfs
druid.zk.paths.base=/druid
druid.zk.service.host=HOST:2181
druid.hadoop.security.kerberos.principal=PRINCIPAL
druid.hadoop.security.kerberos.keytab=KEYTAB

Here are the configs for my middleManager:
bash-4.1$ cat /etc/druid/middleManager/runtime.properties
druid.emitter=logging
druid.indexer.runner.javaOpts=-server -Xmx8g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Dhdp.version=2.5.3.0-37 -Dhadoop.mapreduce.job.classloader=true
druid.indexer.runner.startPort=8100
druid.indexer.task.baseTaskDir=/tmp/persistent/tasks
druid.indexer.task.hadoopWorkingPath=/tmp/druid-indexing
druid.monitoring.emissionPeriod=PT1m
druid.monitoring.monitors=["io.druid.server.metrics.EventReceiverFirehoseMonitor","io.druid.client.cache.CacheMonitor","com.metamx.metrics.JvmMonitor"]
druid.port=8091
druid.processing.buffer.sizeBytes=536870912
druid.processing.numThreads=2
druid.server.http.numThreads=50
druid.service=druid/middlemanager
druid.worker.capacity=80

Are there known compatibility issues between Druid 0.9.2 and HDP 2.5.X.X (Hadoop 2.7.3)? My issue seems similar to the many cases posted here that relate to parsing the *.xml files properly and loading the proper version of the Hadoop dependencies.

Thanks in advance for any assistance!

-Dan

We have resolved this issue, and I’m following up with the results of our troubleshooting.

Although a portion of the YARN indexing issue we had was related to the hadoop-client version not being correct for the version of HDP we were indexing with, this was not the entire problem.

With some help from the Druid community, we identified another incompatible component that also needed to be compiled for our specific version of HDP. The druid-hdfs-storage extension was colliding with our hadoop-client, giving us Jackson parsing errors in the MapReduce jobs.

Compiling both the druid-hdfs-storage extension and the hadoop-client libraries for HDP 2.5.3.0 completely resolved our YARN indexing issues.
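
For anyone chasing a similar collision, a rough way to spot mismatched Jackson versions is to compare the jar file names across the directories that end up on the indexing classpath. This is only a sketch under assumptions: the function names are made up, it relies on the usual artifact-version.jar naming, and it cannot see classes shaded inside other jars.

```python
import os
import re
from collections import defaultdict

# Matches names such as jackson-databind-2.4.6.jar or jackson-core-asl-1.9.13.jar.
JACKSON_JAR = re.compile(r"^(jackson-[a-z0-9\-]+?)-(\d[\w.\-]*)\.jar$")


def jackson_versions(directories):
    """Return {artifact: set of versions} for Jackson jars found under the directories."""
    versions = defaultdict(set)
    for directory in directories:
        for _root, _dirs, files in os.walk(directory):
            for file_name in files:
                match = JACKSON_JAR.match(file_name)
                if match:
                    versions[match.group(1)].add(match.group(2))
    return dict(versions)


def conflicting(versions):
    """Keep only the artifacts that appear in more than one version."""
    return {
        artifact: sorted(found)
        for artifact, found in versions.items()
        if len(found) > 1
    }
```

With the layout from this thread, a call could look like conflicting(jackson_versions(["/opt/druid/lib", "/opt/druid/extensions/druid-hdfs-storage", "/opt/druid/hadoop-dependencies/hadoop-client/2.7.3"])). The extension subdirectory name is an assumption based on the default layout. Any artifact listed with more than one version is a candidate for the kind of collision described above.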

-Dan

Can you share the details of what components you used to build your hdfs-storage extension? In HDP 2.5.3, Jackson 2.2.3 is used. I updated the POM and ran mvn clean package. There are some compatibility problems with a binding annotation that is used:

[ERROR] /home/micbui/druid/common/src/main/java/io/druid/guice/GuiceAnnotationIntrospector.java:[44,35] cannot find symbol
symbol: method annotations()
location: variable m of type com.fasterxml.jackson.databind.introspect.AnnotatedMember

Can you please describe in more detail what exactly is involved in recompiling the Druid extension and client libraries?
I tried to recompile Druid with an older version of Jackson, but could not get a YARN job to run.