Unable to resolve the Jackson library incompatibility issue

Hello, I am running Druid 0.9.0 on an Azure HDInsight cluster (HDP 2.4.1.1-3). The Hadoop client is 2.7.1.
Despite countless attempts, I have been unable to resolve the following Jackson error:

Error: class com.fasterxml.jackson.datatype.guava.deser.HostAndPortDeserializer overrides final method deserialize.

I've tried all of the workarounds documented here:

https://github.com/druid-io/druid/blob/master/docs/content/operations/other-hadoop.md

to no avail.

Recompiling with the Jackson dependency shaded, as suggested by another user who hit the same issue, was also unsuccessful.

The last thing I tried was adding

"mapreduce.job.user.classpath.first": "true"

to the jobProperties property of my indexing task, with the following results:

Diagnostics: Exception from container-launch.
Container id: container_e02_1461544451524_0047_05_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
at org.apache.hadoop.util.Shell.run(Shell.java:487)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:371)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:303)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Shell output: main : command provided 1
main : run as user is nobody
main : requested yarn user is druid

We are trying to run an indexing job.
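For reference, this is roughly where that property sits in the indexing task spec (a trimmed sketch based on the Druid batch ingestion docs; everything outside jobProperties is omitted here):

```json
{
  "type": "index_hadoop",
  "spec": {
    "tuningConfig": {
      "type": "hadoop",
      "jobProperties": {
        "mapreduce.job.user.classpath.first": "true"
      }
    }
  }
}
```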

Any help would be appreciated. Has anyone had success with the same setup?

I think your best bet is to run a dependency dump of your particular version of Hadoop and recompile Druid with whatever versions of Jackson or Hadoop it actually uses.
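To make that concrete, a rough command sketch (the Maven property name here is an assumption; check the top-level pom.xml of your Druid checkout for the exact property your version uses):

```shell
# On a cluster node: list the Jackson jars the Hadoop client actually ships
hadoop classpath | tr ':' '\n' | grep -i jackson

# In a Druid source checkout: rebuild against the cluster's Hadoop version
mvn clean package -DskipTests -Dhadoop.compile.version=2.7.1
```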

Hi all,

Great project.

I hate to ask a naive question, but can someone point me to a reference or guide on how to do such a thing? We are interested in exactly the same use case (Azure-hosted Druid paired with HDInsight).

Thanks.

Azure should be supported via the community extensions:
http://druid.io/docs/0.9.0/development/extensions.html

Simply include the Azure extension.
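If it helps, loading it in 0.9.0 looks roughly like this in common.runtime.properties, after fetching the extension with the pull-deps tool (the extension name and Azure property keys are taken from the extensions docs; the account/key/container values are placeholders):

```
druid.extensions.loadList=["druid-azure-extensions"]

# Use Azure blob storage as deep storage
druid.storage.type=azure
druid.azure.account=<your-storage-account>
druid.azure.key=<your-storage-key>
druid.azure.container=<your-container>
```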