Druid-0.9.0-rc2 batch ingestion failing

Hi, I'm trying to ingest some JSON files from this tutorial - http://druid.io/docs/0.9.0-rc2/ingestion/batch-ingestion.html. I finally got the job submission to work, but my MapReduce jobs are failing. I can see some suspicious errors regarding fasterxml in the logs. The relevant part is here:

2016-03-09T15:00:38,906 INFO [task-runner-0-priority-0] io.druid.indexing.common.task.HadoopIndexTask - Starting a hadoop index generator job…
2016-03-09T15:00:38,929 INFO [task-runner-0-priority-0] io.druid.indexer.path.StaticPathSpec - Adding paths[/bix/srv-nice/tmp/wikipedia_data.json]
2016-03-09T15:00:38,933 INFO [task-runner-0-priority-0] io.druid.indexer.HadoopDruidIndexerJob - No metadataStorageUpdaterJob set in the config. This is cool if you are running a hadoop index task, otherwise nothing will be uploaded to database.
2016-03-09T15:00:38,962 INFO [task-runner-0-priority-0] io.druid.indexer.path.StaticPathSpec - Adding paths[/bix/srv-nice/tmp/wikipedia_data.json]
11.007: [GC pause (young), 0.0237470 secs]
[Parallel Time: 12.9 ms, GC Workers: 33]
[GC Worker Start (ms): Min: 11006.9, Avg: 11007.3, Max: 11007.6, Diff: 0.7]
[Ext Root Scanning (ms): Min: 0.3, Avg: 1.6, Max: 3.9, Diff: 3.6, Sum: 53.6]
[Update RS (ms): Min: 0.0, Avg: 0.1, Max: 0.5, Diff: 0.5, Sum: 4.8]
[Processed Buffers: Min: 0, Avg: 4.4, Max: 23, Diff: 23, Sum: 145]
[Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 3.5]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2]
[Object Copy (ms): Min: 5.4, Avg: 8.4, Max: 11.0, Diff: 5.5, Sum: 275.6]
[Termination (ms): Min: 0.0, Avg: 1.9, Max: 2.8, Diff: 2.8, Sum: 63.1]
[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 2.0]
[GC Worker Total (ms): Min: 11.8, Avg: 12.2, Max: 12.6, Diff: 0.8, Sum: 402.8]
[GC Worker End (ms): Min: 11019.4, Avg: 11019.5, Max: 11019.6, Diff: 0.1]

[Code Root Fixup: 0.1 ms]

[Code Root Migration: 0.2 ms]
[Clear CT: 1.6 ms]
[Other: 9.0 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 7.3 ms]
[Ref Enq: 0.1 ms]
[Free CSet: 1.3 ms]
[Eden: 465.0M(465.0M)->0.0B(1185.0M) Survivors: 24.0M->43.0M Heap: 517.1M(2048.0M)->70.9M(2048.0M)]
[Times: user=0.21 sys=0.03, real=0.02 secs]
2016-03-09T15:00:40,604 INFO [task-runner-0-priority-0] io.druid.indexer.JobHelper - Uploading jar to path[var/druid/hadoop-tmp/wikipedia/2016-03-09T150028.899+0100/0f31e412ad9449aea06fec26d674d99d/classpath/mysql-metadata-storage-0.9.0-SNAPSHOT.jar]
2016-03-09T15:00:41,295 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://myserv:8188/ws/v1/timeline/
2016-03-09T15:00:41,621 WARN [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobSubmitter - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2016-03-09T15:00:41,628 WARN [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2016-03-09T15:00:42,169 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-03-09T15:00:42,293 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2016-03-09T15:00:42,465 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1456233729853_23001
2016-03-09T15:00:42,602 INFO [task-runner-0-priority-0] org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2016-03-09T15:00:44,035 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1456233729853_23001
2016-03-09T15:00:44,080 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - The url to track the job: http://myservt:8088/proxy/application_1456233729853_23001/
2016-03-09T15:00:44,081 INFO [task-runner-0-priority-0] io.druid.indexer.IndexGeneratorJob - Job wikipedia-index-generator-Optional.of([2013-08-31T00:00:00.000+02:00/2013-09-01T00:00:00.000+02:00]) submitted, status available at http://mysrv:8088/proxy/application_1456233729853_23001/
2016-03-09T15:00:44,082 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Running job: job_1456233729853_23001
2016-03-09T15:01:00,207 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_1456233729853_23001 running in uber mode : false
2016-03-09T15:01:00,209 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 0% reduce 0%
2016-03-09T15:01:15,532 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1456233729853_23001_m_000000_0, Status : FAILED
Error: com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z
2016-03-09T15:01:27,617 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1456233729853_23001_m_000000_1, Status : FAILED
Error: com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z
2016-03-09T15:01:33,886 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-09T15:01:46,721 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1456233729853_23001_m_000000_2, Status : FAILED
Error: com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z
2016-03-09T15:01:53,756 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 100%
2016-03-09T15:01:54,771 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_1456233729853_23001 failed with state FAILED due to: Task failed task_1456233729853_23001_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
2016-03-09T15:01:54,898 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 12
Job Counters
Failed map tasks=4
Launched map tasks=4
Other local map tasks=3
Rack-local map tasks=1
Total time spent by all maps in occupied slots (ms)=43304
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=43304
Total vcore-seconds taken by all map tasks=43304
Total megabyte-seconds taken by all map tasks=243888128
Map-Reduce Framework
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
2016-03-09T15:01:54,909 INFO [task-runner-0-priority-0] io.druid.indexer.JobHelper - Deleting path[var/druid/hadoop-tmp/wikipedia/2016-03-09T150028.899+0100/0f31e412ad9449aea06fec26d674d99d]
2016-03-09T15:01:54,938 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_wikipedia_2016-03-09T15:00:28.895+01:00, type=index_hadoop, dataSource=wikipedia}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:171) ~[druid-indexing-service-0.9.0-rc2.jar:0.9.0-rc2]
at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:208) ~[druid-indexing-service-0.9.0-rc2.jar:0.9.0-rc2]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:338) [druid-indexing-service-0.9.0-rc2.jar:0.9.0-rc2]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:318) [druid-indexing-service-0.9.0-rc2.jar:0.9.0-rc2]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_75]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_75]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_75]
at java.lang.Thread.run(Thread.java:745) [?:1.7.0_75]
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_75]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_75]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_75]
at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_75]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:168) ~[druid-indexing-service-0.9.0-rc2.jar:0.9.0-rc2]
… 7 more
Caused by: com.metamx.common.ISE: Job[class io.druid.indexer.IndexGeneratorJob] failed!
at io.druid.indexer.JobHelper.runJobs(JobHelper.java:343) ~[druid-indexing-hadoop-0.9.0-rc2.jar:0.9.0-rc2]
at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:94) ~[druid-indexing-hadoop-0.9.0-rc2.jar:0.9.0-rc2]
at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:261) ~[druid-indexing-service-0.9.0-rc2.jar:0.9.0-rc2]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_75]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_75]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_75]
at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_75]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:168) ~[druid-indexing-service-0.9.0-rc2.jar:0.9.0-rc2]
… 7 more
2016-03-09T15:01:54,952 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_wikipedia_2016-03-09T15:00:28.895+01:00",
  "status" : "FAILED",
  "duration" : 79780
}
2016-03-09T15:01:54,969 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.coordination.AbstractDataSegmentAnnouncer.stop()] on object[io.druid.server.coordination.BatchDataSegmentAnnouncer@4e1ba9a0].
2016-03-09T15:01:54,969 INFO [main] io.druid.server.coordination.AbstractDataSegmentAnnouncer - Stopping class io.druid.server.coordination.BatchDataSegmentAnnouncer with config[io.druid.server.initialization.ZkPathsConfig@22d6a3a7]
2016-03-09T15:01:54,970 INFO [main] io.druid.curator.announcement.Announcer - unannouncing [/druid/dev/announcements/mysrv:8103]
2016-03-09T15:01:54,982 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.indexing.worker.executor.ExecutorLifecycle.stop() throws java.lang.Exception] on object[io.druid.indexing.worker.executor.ExecutorLifecycle@2b53ef52].
2016-03-09T15:01:54,985 INFO [main] org.eclipse.jetty.server.ServerConnector - Stopped ServerConnector@65f8cfc6{HTTP/1.1}{0.0.0.0:8103}
2016-03-09T15:01:54,987 INFO [main] org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@2fffbd21{/,null,UNAVAILABLE}
2016-03-09T15:01:54,988 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.indexing.overlord.ThreadPoolTaskRunner.stop()] on object[io.druid.indexing.overlord.ThreadPoolTaskRunner@34611626].
2016-03-09T15:01:54,989 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.curator.discovery.ServerDiscoverySelector.stop() throws java.io.IOException] on object[io.druid.curator.discovery.ServerDiscoverySelector@21bda9bd].
2016-03-09T15:01:54,993 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.curator.announcement.Announcer.stop()] on object[io.druid.curator.announcement.Announcer@4613f33a].
2016-03-09T15:01:54,993 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.curator.discovery.ServerDiscoverySelector.stop() throws java.io.IOException] on object[io.druid.curator.discovery.ServerDiscoverySelector@485f63bc].
2016-03-09T15:01:54,993 INFO [main] io.druid.curator.CuratorModule - Stopping Curator
2016-03-09T15:01:54,996 INFO [main] org.apache.zookeeper.ZooKeeper - Session: 0x1530966791f1dc2 closed
2016-03-09T15:01:54,996 INFO [main-EventThread] org.apache.zookeeper.ClientCnxn - EventThread shut down
2016-03-09T15:01:54,996 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void com.metamx.http.client.NettyHttpClient.stop()] on object[com.metamx.http.client.NettyHttpClient@689362a0].
2016-03-09T15:01:55,018 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.namespace.NamespacedExtractionModule$NamespaceStaticConfiguration.stop()] on object[io.druid.server.namespace.NamespacedExtractionModule$NamespaceStaticConfiguration@261e0499].
2016-03-09T15:01:55,018 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void com.metamx.metrics.MonitorScheduler.stop()] on object[com.metamx.metrics.MonitorScheduler@3e27eb98].
2016-03-09T15:01:55,018 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void com.metamx.emitter.service.ServiceEmitter.close() throws java.io.IOException] on object[com.metamx.emitter.service.ServiceEmitter@490a390e].
2016-03-09T15:01:55,019 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-09T15:01:55,031 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.initialization.Log4jShutterDownerModule$Log4jShutterDowner.stop()] on object[io.druid.initialization.Log4jShutterDownerModule$Log4jShutterDowner@4dcafa98].
2016-03-09 15:01:55,093 Thread-2 ERROR Unable to register shutdown hook because JVM is shutting down. java.lang.IllegalStateException: Not started
at io.druid.common.config.Log4jShutdown.addShutdownCallback(Log4jShutdown.java:45)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.addShutdownCallback(Log4jContextFactory.java:273)
at org.apache.logging.log4j.core.LoggerContext.setUpShutdownHook(LoggerContext.java:256)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:216)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:145)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41)
at org.apache.logging.log4j.LogManager.getContext(LogManager.java:182)
at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:103)
at org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:43)
at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42)
at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:29)
at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:253)
at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155)
at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
at org.apache.hadoop.hdfs.LeaseRenewer.(LeaseRenewer.java:72)
at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:805)
at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:958)
at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:902)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2704)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
Heap
garbage-first heap total 2097152K, used 714609K [0x00000005f5a00000, 0x0000000675a00000, 0x00000007f5a00000)
region size 1024K, 671 young (687104K), 43 survivors (44032K)
compacting perm gen total 90112K, used 89811K [0x00000007f5a00000, 0x00000007fb200000, 0x0000000800000000)
the space 90112K, 99% used [0x00000007f5a00000, 0x00000007fb1b4f90, 0x00000007fb1b5000, 0x00000007fb200000)
No shared spaces configured.

My spec file is not very different from the official one:

{
  "type" : "index_hadoop",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "wikipedia",
      "parser" : {
        "type" : "hadoopyString",
        "parseSpec" : {
          "format" : "json",
          "timestampSpec" : {
            "column" : "timestamp",
            "format" : "auto"
          },
          "dimensionsSpec" : {
            "dimensions": ["page","language","user","unpatrolled","newPage","robot","anonymous","namespace","continent","country","region","city"],
            "dimensionExclusions" : [],
            "spatialDimensions" : []
          }
        }
      },
      "metricsSpec" : [
        {
          "type" : "count",
          "name" : "count"
        },
        {
          "type" : "doubleSum",
          "name" : "added",
          "fieldName" : "added"
        },
        {
          "type" : "doubleSum",
          "name" : "deleted",
          "fieldName" : "deleted"
        },
        {
          "type" : "doubleSum",
          "name" : "delta",
          "fieldName" : "delta"
        }
      ],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "DAY",
        "queryGranularity" : "NONE",
        "intervals" : [ "2013-08-31/2013-09-01" ]
      }
    },
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "/MyDirectory/example/wikipedia_data.json"
      }
    },
    "tuningConfig" : {
      "type": "hadoop"
    }
  }
}

I can see that there have already been some problems with fasterxml - https://groups.google.com/forum/#!topic/druid-user/UM-Cgj750sY. Is this the same issue here?

Do you know what version of jackson-core is being used?

I think this function:

Error: com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z

only exists in 2.3 or later:

https://github.com/FasterXML/jackson-core/blob/master/src/main/java/com/fasterxml/jackson/core/JsonFactory.java#L362

  • Jon

Hi Jonathan,

I'm not sure about the Jackson version; I do know that the Hadoop version used is 2.6.0.2.2.0.0-2041. Inside the hadoop/lib directory I can see:

  • jackson-core-2.2.3.jar
  • jackson-core-asl-1.9.13.jar
  • jackson-jaxrs-1.9.13.jar
  • jackson-mapper-asl-1.9.13.jar

So I assume the version of jackson-core is 2.2.3.
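To confirm at runtime which Jackson a task actually sees, a small reflection probe can be run from the task's JVM. This is just a diagnostic sketch (the class and method names are mine, standard Java reflection only); checking `com.fasterxml.jackson.core.JsonFactory` for `requiresPropertyOrdering` distinguishes jackson-core 2.3+ (present) from 2.2.x (absent), and the code source shows which jar won:

```java
// JacksonProbe.java - diagnostic sketch: report whether a class declares a
// given no-arg method, and which jar the class was actually loaded from.
public class JacksonProbe {

    /** Returns true if className is loadable and declares a no-arg method methodName. */
    public static boolean hasMethod(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            cls.getMethod(methodName);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    /** Returns the location (jar path) a class was loaded from, or null if unknown. */
    public static String locationOf(String className) {
        try {
            java.security.CodeSource src =
                    Class.forName(className).getProtectionDomain().getCodeSource();
            return src == null ? null : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String target = "com.fasterxml.jackson.core.JsonFactory";
        System.out.println(target + " loaded from: " + locationOf(target));
        System.out.println("requiresPropertyOrdering present: "
                + hasMethod(target, "requiresPropertyOrdering"));
    }
}
```

If `hasMethod` reports false inside a mapper, the 2.2.3 jar from hadoop/lib is shadowing the newer one Druid was built against.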

On Thursday, 10 March 2016 at 01:36:26 UTC+1, Jonathan Wei wrote:

Sounds like that might be the source of the problem. Is this link in the thread you mentioned helpful?

https://github.com/druid-io/druid/blob/master/docs/content/operations/other-hadoop.md

I’m not personally familiar with how to resolve these sorts of hadoop dependency issues, so hopefully someone else can chime in if you’re still encountering issues.

  • Jon

Yep,

I have druid.indexer.task.defaultHadoopCoordinates= in my configuration, with the Hadoop jars passed on the classpath at node startup. I can see that they are being loaded by Druid during the batch ingestion attempt.

During the ingestion I could see some Jackson dependency issues:

2016-03-09T17:12:43,745 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Running job: job_1456233729853_23126
2016-03-09T17:12:54,821 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_1456233729853_23126 running in uber mode : false
2016-03-09T17:12:54,823 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 0% reduce 0%
2016-03-09T17:13:11,355 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1456233729853_23126_m_000000_0, Status : FAILED
Error: com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z
2016-03-09T17:13:22,413 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1456233729853_23126_m_000000_1, Status : FAILED
Error: class com.fasterxml.jackson.datatype.guava.deser.HostAndPortDeserializer overrides final method deserialize.(Lcom/fasterxml/jackson/core/JsonParser;Lcom/fasterxml/jackson/databind/DeserializationContext;)Ljava/lang/Object;
2016-03-09T17:13:27,436 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1456233729853_23126_m_000000_2, Status : FAILED
Error: class com.fasterxml.jackson.datatype.guava.deser.HostAndPortDeserializer overrides final method deserialize.(Lcom/fasterxml/jackson/core/JsonParser;Lcom/fasterxml/jackson/databind/DeserializationContext;)Ljava/lang/Object;
2016-03-09T17:13:33,460 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 100%
2016-03-09T17:13:33,467 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_1456233729853_23126 failed with state FAILED due to: Task failed task_1456233729853_23126_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

So I assumed this must be something similar to the CDH issue described here - http://druid.io/docs/0.9.0-rc2/operations/other-hadoop.html.
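For context, besides rebuilding against a specific Hadoop version, one approach for this class of conflict is isolating the MapReduce job's classpath via job properties in the tuningConfig. As a hedged sketch (the property is a standard Hadoop 2.x MapReduce setting, but check the other-hadoop doc for what your Druid/Hadoop combination supports):

```json
"tuningConfig" : {
  "type" : "hadoop",
  "jobProperties" : {
    "mapreduce.job.classloader" : "true"
  }
}
```

With an isolated job classloader, the mappers prefer the job's own Jackson jars over the ones in hadoop/lib.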

The proposed build.sbt seemed somewhat outdated (it targeted version 0.8.1), so I modified it to the following:

```scala
resolvers += "Local Maven Repository" at "file://" + Path.userHome.absolutePath + "/.m2/repository"

// The same Jackson/ASM exclusions apply to every Druid artifact.
val druidExclusions = Seq(
  ExclusionRule("org.ow2.asm"),
  ExclusionRule("com.fasterxml.jackson.core"),
  ExclusionRule("com.fasterxml.jackson.datatype"),
  ExclusionRule("com.fasterxml.jackson.dataformat"),
  ExclusionRule("com.fasterxml.jackson.jaxrs"),
  ExclusionRule("com.fasterxml.jackson.module")
)

libraryDependencies ++= Seq(
  "com.amazonaws" % "aws-java-sdk" % "1.9.23" exclude("commons-logging", "commons-logging"),
  "org.joda" % "joda-convert" % "1.7",
  "joda-time" % "joda-time" % "2.7",
  "io.druid" % "druid" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid" % "druid-services" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid" % "druid-indexing-service" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid" % "druid-indexing-hadoop" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "mysql-metadata-storage" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "druid-histogram" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "druid-hdfs-storage" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "druid-avro-extensions" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "druid-datasketches" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "druid-namespace-lookup" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "com.fasterxml.jackson.core" % "jackson-annotations" % "2.3.0",
  "com.fasterxml.jackson.core" % "jackson-core" % "2.3.0",
  "com.fasterxml.jackson.core" % "jackson-databind" % "2.3.0",
  "com.fasterxml.jackson.datatype" % "jackson-datatype-guava" % "2.3.0",
  "com.fasterxml.jackson.datatype" % "jackson-datatype-joda" % "2.3.0",
  "com.fasterxml.jackson.jaxrs" % "jackson-jaxrs-base" % "2.3.0",
  "com.fasterxml.jackson.jaxrs" % "jackson-jaxrs-json-provider" % "2.3.0",
  "com.fasterxml.jackson.jaxrs" % "jackson-jaxrs-smile-provider" % "2.3.0",
  "com.fasterxml.jackson.module" % "jackson-module-jaxb-annotations" % "2.3.0",
  "com.sun.jersey" % "jersey-servlet" % "1.17.1",
  "mysql" % "mysql-connector-java" % "5.1.34",
  "org.scalatest" %% "scalatest" % "2.2.3" % "test",
  "org.mockito" % "mockito-core" % "1.10.19" % "test"
)

assemblyMergeStrategy in assembly := {
  case path if path contains "pom." => MergeStrategy.first
  case path if path contains "javax.inject.Named" => MergeStrategy.first
  case path if path contains "mime.types" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/impl/SimpleLog.class" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/impl/SimpleLog$1.class" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/impl/NoOpLog.class" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/LogFactory.class" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/LogConfigurationException.class" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/Log.class" => MergeStrategy.first
  case path if path contains "META-INF/jersey-module-version" => MergeStrategy.first
  case path if path contains ".properties" => MergeStrategy.first
  case path if path contains ".class" => MergeStrategy.first
  case path if path contains ".dtd" => MergeStrategy.first
  case path if path contains ".xsd" => MergeStrategy.first
  case path if path contains "Syntax.java" => MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```
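When assembling a fat jar like this, it can save time to verify that only one copy of each Jackson class made it in, since duplicate or mismatched copies are a classic cause of NoSuchMethodError. A small diagnostic sketch (the class name is mine, standard Java only) that lists every classpath location providing a given class:

```java
// DuplicateClassCheck.java - diagnostic sketch: list every location on the
// classpath that provides a given class. More than one hit for a Jackson
// class means conflicting copies ended up on the classpath.
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

public class DuplicateClassCheck {

    /** Returns all classpath locations that contain the given class. */
    public static List<URL> locationsOf(String className) throws IOException {
        String resource = className.replace('.', '/') + ".class";
        Enumeration<URL> urls =
                DuplicateClassCheck.class.getClassLoader().getResources(resource);
        List<URL> out = new ArrayList<>();
        while (urls.hasMoreElements()) {
            out.add(urls.nextElement());
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        String target = args.length > 0
                ? args[0]
                : "com.fasterxml.jackson.core.JsonFactory";
        List<URL> hits = locationsOf(target);
        System.out.println(target + " found in " + hits.size() + " location(s):");
        for (URL u : hits) {
            System.out.println("  " + u);
        }
    }
}
```

Running it with the fat jar on the classpath should report exactly one location for each jackson-core and jackson-databind class.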

Using this build I've built a fat jar with the needed dependencies. I included it on my classpath and removed Druid's lib/* directory. Then I made sure that the tmp classpath directory was clean (so as not to include old jars). Unfortunately, even before the task was submitted to YARN I got this:

016-03-10T15:48:29,587 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_wikipedia_2016-03-10T15:48:20.818+01:00, type=index_hadoop, dataSource=wikipedia}]

java.lang.RuntimeException: java.lang.reflect.InvocationTargetException

at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]

at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:171) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]

at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:175) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]

at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:338) [druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]

at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:318) [druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]

at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_75]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_75]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_75]

at java.lang.Thread.run(Thread.java:745) [?:1.7.0_75]

Caused by: java.lang.reflect.InvocationTargetException

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_75]

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_75]

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_75]

at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_75]

at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:168) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]

… 7 more

Caused by: java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.introspect.AnnotatedMember.annotations()Ljava/lang/Iterable;

at io.druid.guice.GuiceAnnotationIntrospector.findInjectableValueId(GuiceAnnotationIntrospector.java:44) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]

at com.fasterxml.jackson.databind.introspect.AnnotationIntrospectorPair.findInjectableValueId(AnnotationIntrospectorPair.java:268) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]

at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory._addDeserializerConstructors(BasicDeserializerFactory.java:427) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]

```
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory._constructDefaultValueInstantiator(BasicDeserializerFactory.java:319) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findValueInstantiator(BasicDeserializerFactory.java:263) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.buildBeanDeserializer(BeanDeserializerFactory.java:263) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:168) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:405) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:354) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:267) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:247) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:146) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.DeserializationContext.findContextualValueDeserializer(DeserializationContext.java:305) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.construct(PropertyBasedCreator.java:96) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.resolve(BeanDeserializerBase.java:414) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:298) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:247) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:146) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:322) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:2990) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:2884) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2034) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at io.druid.indexing.common.task.HadoopIndexTask$HadoopDetermineConfigInnerProcessing.runTask(HadoopIndexTask.java:278) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_75]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_75]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_75]
at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_75]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:168) ~[druid_build-assembly-0.1-SNAPSHOT.jar:0.1-SNAPSHOT]
... 7 more
```
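In case it helps others debug this class of problem: a quick way to see which jar a given Jackson class is actually being loaded from is to print its code source. This is just an illustrative diagnostic, not part of Druid; on a Druid/Hadoop classpath you would pass `com.fasterxml.jackson.core.JsonFactory` as the argument.

```java
public class WhichJar {
    // Print where a class was loaded from. Useful for diagnosing Jackson
    // version conflicts: a NoSuchMethodError usually means an older
    // jackson-core shadows the one the application was built against.
    static String locate(String className) throws ClassNotFoundException {
        Class<?> c = Class.forName(className);
        java.security.CodeSource src = c.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "bootstrap classpath";  // JDK core classes report no code source
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // e.g. java -cp <job classpath> WhichJar com.fasterxml.jackson.core.JsonFactory
        String target = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(target + " -> " + locate(target));
    }
}
```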

So I thought I had probably built it with an incorrect Jackson version. I tried changing my build.sbt to:

```scala
resolvers += "Local Maven Repository" at "file://" + Path.userHome.absolutePath + "/.m2/repository"

// The same Jackson (plus asm) exclusions are applied to every Druid artifact
val druidExclusions = Seq(
  ExclusionRule("org.ow2.asm"),
  ExclusionRule("com.fasterxml.jackson.core"),
  ExclusionRule("com.fasterxml.jackson.datatype"),
  ExclusionRule("com.fasterxml.jackson.dataformat"),
  ExclusionRule("com.fasterxml.jackson.jaxrs"),
  ExclusionRule("com.fasterxml.jackson.module")
)

libraryDependencies ++= Seq(
  "com.amazonaws" % "aws-java-sdk" % "1.9.23" exclude("commons-logging", "commons-logging"),
  "org.joda" % "joda-convert" % "1.7",
  "joda-time" % "joda-time" % "2.7",
  "io.druid" % "druid" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid" % "druid-services" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid" % "druid-indexing-service" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid" % "druid-indexing-hadoop" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "mysql-metadata-storage" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "druid-histogram" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "druid-hdfs-storage" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "druid-avro-extensions" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "druid-datasketches" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "io.druid.extensions" % "druid-namespace-lookup" % "0.9.0-rc2" excludeAll (druidExclusions: _*),
  "com.fasterxml.jackson.core" % "jackson-annotations" % "2.2.3",
  "com.fasterxml.jackson.core" % "jackson-core" % "2.2.3",
  "com.fasterxml.jackson.core" % "jackson-databind" % "2.2.3",
  "com.fasterxml.jackson.datatype" % "jackson-datatype-guava" % "2.2.3",
  "com.fasterxml.jackson.datatype" % "jackson-datatype-joda" % "2.2.3",
  "com.fasterxml.jackson.jaxrs" % "jackson-jaxrs-base" % "2.2.3",
  "com.fasterxml.jackson.jaxrs" % "jackson-jaxrs-json-provider" % "2.2.3",
  "com.fasterxml.jackson.jaxrs" % "jackson-jaxrs-smile-provider" % "2.2.3",
  "com.fasterxml.jackson.module" % "jackson-module-jaxb-annotations" % "2.2.3",
  "com.sun.jersey" % "jersey-servlet" % "1.17.1",
  "mysql" % "mysql-connector-java" % "5.1.34",
  "org.scalatest" %% "scalatest" % "2.2.3" % "test",
  "org.mockito" % "mockito-core" % "1.10.19" % "test"
)

assemblyMergeStrategy in assembly := {
  case path if path contains "pom." => MergeStrategy.first
  case path if path contains "javax.inject.Named" => MergeStrategy.first
  case path if path contains "mime.types" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/impl/SimpleLog.class" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/impl/SimpleLog$1.class" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/impl/NoOpLog.class" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/LogFactory.class" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/LogConfigurationException.class" => MergeStrategy.first
  case path if path contains "org/apache/commons/logging/Log.class" => MergeStrategy.first
  case path if path contains "META-INF/jersey-module-version" => MergeStrategy.first
  case path if path contains ".properties" => MergeStrategy.first
  case path if path contains ".class" => MergeStrategy.first
  case path if path contains ".dtd" => MergeStrategy.first
  case path if path contains ".xsd" => MergeStrategy.first
  case path if path contains "Syntax.java" => MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```
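A quick way to confirm what actually ended up in the assembly is to list its Jackson entries. The pipeline below is the check; the `printf` input is a simulated jar listing so the example is self-contained - in practice you would pipe `jar tf target/scala-2.10/druid_build-assembly-0.1-SNAPSHOT.jar` instead (the jar path is illustrative):

```shell
# Simulated jar listing; on a real build, replace the printf with:
#   jar tf target/scala-2.10/druid_build-assembly-0.1-SNAPSHOT.jar
printf '%s\n' \
  "com/fasterxml/jackson/core/JsonFactory.class" \
  "io/druid/jackson/DefaultObjectMapper.class" \
  "org/joda/time/DateTime.class" \
  | grep "com/fasterxml/jackson" \
  | sort -u
```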

Now I compiled my fat jar again and included it on my classpath. This time it did not complain about fasterxml, but it still cannot submit a task to YARN, getting stuck at:

```
2016-03-10T17:33:58,134 WARN [task-runner-0-priority-0] org.apache.hadoop.hdfs.BlockReaderLocal - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2016-03-10T17:33:58,140 WARN [task-runner-0-priority-0] org.apache.hadoop.hdfs.BlockReaderLocal - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2016-03-10T17:33:58,158 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:34:29,197 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:34:51,734 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:35:12,447 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:35:36,971 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:35:51,740 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:36:19,434 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:36:51,757 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:36:53,041 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:37:31,054 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:37:51,803 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:38:13,972 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:38:46,368 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:38:51,831 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:39:02,109 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:39:29,253 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:39:51,859 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:39:56,796 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:40:38,597 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:40:51,884 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:41:13,270 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:41:33,245 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:41:51,910 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:42:00,325 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:42:22,960 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:42:47,645 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:42:51,938 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:43:03,324 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:43:22,082 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:43:45,590 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:43:51,966 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:44:01,635 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:44:23,945 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:44:42,712 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:44:51,994 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:45:13,052 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:45:35,393 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:45:52,020 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:46:19,945 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:46:52,048 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:46:52,709 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:47:22,488 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-03-10T17:47:52,076 INFO [HttpPostEmitter-1-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://mysrv:8125
2016-03-10T17:48:03,950 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
2016-03-10T17:48:03,952 WARN [task-runner-0-priority-0] org.apache.hadoop.io.retry.RetryInvocationHandler - Exception while invoking class org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication. Not retrying because failovers (30) exceeded maximum allowed (30)
java.net.ConnectException: Call From mysrv/xx.xx.xx.xx to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor75.newInstance(Unknown Source) ~[?:?]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.7.0_75]
```

Has anyone had similar issues?

On Friday, March 11, 2016 at 01:14:49 UTC+1, Jonathan Wei wrote:

Hey Nook,

This looks like a conflict between the jackson-datatype-guava jar pulled in by Druid and the jackson-databind jar provided by your Hadoop cluster. Does it help if you set "mapreduce.job.user.classpath.first": "true" on your Hadoop job? You should be able to do that in the "jobProperties" of your Druid Hadoop task. Or, is it possible to update the version of jackson-core and jackson-databind used on your Hadoop cluster?

Either way, we're trying to do the same thing: get your job to run with 2.4.x versions of all the relevant Jackson jars. Druid is built with 2.4.6.

If neither of those approaches works, another strategy is to build a fat jar for Druid that relocates all Jackson classes under a different package prefix (perhaps with the maven-shade-plugin).
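For reference, since the build above uses sbt-assembly rather than Maven: sbt-assembly (0.14.x and later) can do the same kind of relocation with shade rules. A minimal sketch of that approach, assuming the fat jar is built with sbt-assembly:

```scala
// Relocate every bundled Jackson class to a private prefix so the version
// shipped in the fat jar cannot clash with the one Hadoop puts on the
// job classpath. Requires sbt-assembly 0.14.x or later.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.fasterxml.jackson.**" -> "shaded.com.fasterxml.jackson.@1").inAll
)
```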

Hi Gian,

I have tried using the default Druid lib directory (without my own fat jar) and adding the following to my specFile:

```json
"tuningConfig" : {
  "jobProperties" : {
    "mapreduce.job.user.classpath.first": "true"
  }
}
```
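For anyone landing on this thread later: in a Hadoop index task, `tuningConfig` sits alongside `dataSchema` and `ioConfig` inside the `spec` object. A sketch of the placement, based on the 0.9.0 batch-ingestion docs (the `dataSchema` contents are elided and the input path is the one from the logs above):

```json
{
  "type" : "index_hadoop",
  "spec" : {
    "dataSchema" : { "...": "..." },
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "/bix/srv-nice/tmp/wikipedia_data.json"
      }
    },
    "tuningConfig" : {
      "type" : "hadoop",
      "jobProperties" : {
        "mapreduce.job.user.classpath.first" : "true"
      }
    }
  }
}
```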

Unfortunately, I am still getting the exact same messages about Jackson.

As for the Druid fat jar, I tried building it with Druid's Jackson dependencies excluded and a different Jackson version included; I described the resulting issues in my previous answer.

Regarding Hadoop's Jackson version: for now it is not possible for us to easily upgrade it. The best I can do is exclude the jars from my classpath, but that does not seem to solve the problem.

On Friday, March 11, 2016 at 21:40:43 UTC+1, Gian Merlino wrote:

Hi,

Does the issue still exist? Would it be possible for you to try this on Druid 0.8.3 so that we know for sure whether there is a regression in 0.9.0? (I have reached out to you on Hangouts.)

Thanks

Hi,
I'm facing the same issue here. The mappers fail to complete because of a NoSuchMethodError.

I'm running Druid 0.9.0 with Hadoop 2.6.0 (Hortonworks HDP 2.2).

I even tried downloading the Hadoop 2.6 client jars into the hadoop-dependencies dir and setting druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.6.0"], and that doesn't fix the issue either. I also tried Gian's recommendation of setting "mapreduce.job.user.classpath.first": "true".
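One quick sanity check is to list every Jackson jar the cluster puts on the job classpath and look for versions below 2.4. The `echo` input below is a made-up stand-in so the example is self-contained; on the cluster you would pipe the output of `hadoop classpath` instead:

```shell
# Simulated input; on a real cluster, replace the echo with: hadoop classpath
echo "/usr/hdp/lib/jackson-core-2.2.3.jar:/usr/hdp/lib/guava-11.0.jar:/usr/hdp/lib/jackson-databind-2.2.3.jar" \
  | tr ':' '\n' \
  | grep -i jackson \
  | sort -u
```

Any `jackson-core` or `jackson-databind` entry at 2.2.x here is a candidate for the `NoSuchMethodError` below, since `JsonFactory.requiresPropertyOrdering()` only exists from Jackson 2.3 onward.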

```
2016-05-02 21:40:02,290 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2016-05-02 21:40:02,300 ERROR [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoSuchMethodError: com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z
	at com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:457)
	at com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:389)
	at io.druid.jackson.DefaultObjectMapper.<init>(DefaultObjectMapper.java:45)
```

This is because HDP ships with Jackson 2.2, while Druid, like many other big data projects, requires Jackson 2.4.x.

You can't just downgrade Druid to use Jackson 2.2 without also making code changes, and if we did that to support HDP, then CDH and other distributions would break.

The best workaround right now is to downgrade the version of Jackson in your own Druid build and make the code changes throughout the codebase needed for it to work with Jackson 2.2.