Ingesting data from Hadoop to Druid fails

Hi,

I am using a Druid 0.9.0 cluster (3 nodes) with Hadoop 2.3.0 and MySQL. When I ingest data from Hadoop into Druid through the Overlord (indexing node), an error is printed and the task always fails.

The related files are attached:

the task spec: wikiticker-index.json
the common config: common.runtime.properties
the task error log: index_hadoop_wikiticker_2016-04-25T06_26_59.682Z.txt

I can see that the MapReduce job itself runs successfully. Can you spot any configuration error?
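For reference, the relevant parts of my ingestion spec look roughly like this (reconstructed from the error log below: the dataSource and input path appear there, while the granularitySpec values are the quickstart defaults, so the actual attachment may differ):

{
  "type": "index_hadoop",
  "spec": {
    "dataSchema": {
      "dataSource": "wikiticker",
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "day",
        "queryGranularity": "none",
        "intervals": ["2015-09-12/2015-09-13"]
      }
    },
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "static",
        "paths": "/test/my-sample.json"
      }
    }
  }
}

(The parser, dimensions, metrics, and tuningConfig sections are omitted for brevity.)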

Thank you

------------------- error log -------------------

2016-04-25T06:27:58,446 INFO [task-runner-0-priority-0] io.druid.indexing.common.task.HadoopIndexTask - Starting a hadoop index generator job...
2016-04-25T06:27:58,466 INFO [task-runner-0-priority-0] io.druid.indexer.path.StaticPathSpec - Adding paths[/test/my-sample.json]
2016-04-25T06:27:58,469 INFO [task-runner-0-priority-0] io.druid.indexer.HadoopDruidIndexerJob - No metadataStorageUpdaterJob set in the config. This is cool if you are running a hadoop index task, otherwise nothing will be uploaded to database.
2016-04-25T06:27:58,494 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_wikiticker_2016-04-25T06:26:59.682Z, type=index_hadoop, dataSource=wikiticker}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:160) ~[druid-indexing-service-0.9.0.jar:0.9.0]
    at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:208) ~[druid-indexing-service-0.9.0.jar:0.9.0]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:338) [druid-indexing-service-0.9.0.jar:0.9.0]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:318) [druid-indexing-service-0.9.0.jar:0.9.0]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_79]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_79]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_79]
    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_79]
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_79]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_79]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_79]
    at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_79]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
    ... 7 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: No buckets?? seems there is no data to index.
    at io.druid.indexer.IndexGeneratorJob.run(IndexGeneratorJob.java:211) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
    at io.druid.indexer.JobHelper.runJobs(JobHelper.java:323) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
    at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:94) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
    at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:261) ~[druid-indexing-service-0.9.0.jar:0.9.0]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_79]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_79]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_79]
    at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_79]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
    ... 7 more
Caused by: java.lang.RuntimeException: No buckets?? seems there is no data to index.
    at io.druid.indexer.IndexGeneratorJob.run(IndexGeneratorJob.java:172) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
    at io.druid.indexer.JobHelper.runJobs(JobHelper.java:323) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
    at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:94) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
    at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:261) ~[druid-indexing-service-0.9.0.jar:0.9.0]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_79]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_79]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_79]
    at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_79]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
    ... 7 more
2016-04-25T06:27:58,512 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_wikiticker_2016-04-25T06:26:59.682Z",
  "status" : "FAILED",
  "duration" : 51174
}
2016-04-25T06:27:58,520 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.coordination.AbstractDataSegmentAnnouncer.stop()] on object[io.druid.server.coordination.BatchDataSegmentAnnouncer@5d7a4ab].


I ran into this problem today, too. I am using an Imply 1.2.0 cluster with 4 nodes (1 master node, 2 data nodes, 1 query node), and the task always fails.

On Monday, April 25, 2016 at 3:03:20 PM UTC+8, Gary Wu wrote:

This error means the data you are trying to index does not match the "intervals" object you provided. The easiest fix is to make sure you are running in the UTC timezone everywhere and that your data is in UTC as well.
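A worked example of the mismatch (the values below are the quickstart defaults, not necessarily this exact spec): suppose the input file contains a row like

{"time": "2015-09-12T23:44:58.791Z", "channel": "#en.wikipedia", "added": 17}

while the spec declares "intervals": ["2015-09-12/2015-09-13"]. If the task JVM runs in, say, UTC+8 instead of UTC, that interval string gets parsed against local midnight, which is 2015-09-11T16:00:00Z/2015-09-12T16:00:00Z in absolute terms, so the row stamped 23:44Z falls outside every bucket. Once every row misses the interval, the IndexGeneratorJob has nothing to build, which is exactly the "No buckets?? seems there is no data to index." failure above.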

Hi Fangjin,
Yes, that works in my environment. I changed the cluster timezone to UTC and also adjusted the time interval, and now it ingests the data successfully.
Thanks a lot.
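For reference, the commonly recommended way to do this is to force UTC at the JVM level rather than relying on the OS clock (a sketch based on the standard Druid setup; exact file locations vary by install). Add these flags to each Druid service's jvm.config:

-Duser.timezone=UTC
-Dfile.encoding=UTF-8

and, since the interval math also runs inside the Hadoop mappers and reducers, pass the same flags to the MapReduce JVMs through jobProperties in the task's tuningConfig:

"jobProperties": {
  "mapreduce.map.java.opts": "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
  "mapreduce.reduce.java.opts": "-Duser.timezone=UTC -Dfile.encoding=UTF-8"
}

Adjusting the time interval then just means widening or shifting the "intervals" array in the granularitySpec so it covers the timestamps actually present in the data.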

Hi Gary,

Could you please share how you changed the cluster timezone? My cluster (Unix date) is already in UTC. Also, how did you adjust the time interval? I am experiencing the same problem as you. I can't even run the quickstart code, so I don't know whether my cluster is working or not.

Best Regards,

Kamolphan