No buckets?? seems there is no data to index

The timestamps in the HDFS data fall within the interval configured in the spec, but the Hadoop index task still fails with the error below. What other reasons could there be? Thanks.


**stack:**

Caused by: java.lang.RuntimeException: java.lang.RuntimeException: No buckets?? seems there is no data to index.
	at io.druid.indexer.IndexGeneratorJob.run(IndexGeneratorJob.java:211) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:321) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]
	at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:96) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:259) ~[druid-indexing-service-0.8.3.jar:0.8.3]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_51]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_51]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_51]
	at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_51]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:135) ~[druid-indexing-service-0.8.3.jar:0.8.3]
	... 7 more
Caused by: java.lang.RuntimeException: No buckets?? seems there is no data to index.
	at io.druid.indexer.IndexGeneratorJob.run(IndexGeneratorJob.java:160) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:321) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]
	at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:96) ~[druid-indexing-hadoop-0.8.3.jar:0.8.3]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:259) ~[druid-indexing-service-0.8.3.jar:0.8.3]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_51]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_51]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_51]
	at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_51]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:135) ~[druid-indexing-service-0.8.3.jar:0.8.3]

**wikipedia_index_hadoop_task.json:**

    {
      "type": "index_hadoop",
      "spec": {
        "dataSchema": {
          "dataSource": "wikipedia",
          "parser": {
            "type": "string",
            "parseSpec": {
              "format": "json",
              "timestampSpec": {
                "column": "timestamp",
                "format": "auto"
              },
              "dimensionsSpec": {
                "dimensions": [
                  "page",
                  "language",
                  "user",
                  "unpatrolled",
                  "newPage",
                  "robot",
                  "anonymous",
                  "namespace",
                  "continent",
                  "country",
                  "region",
                  "city"
                ],
                "dimensionExclusions": [],
                "spatialDimensions": []
              }
            }
          },
          "metricsSpec": [
            { "type": "count", "name": "count" },
            { "type": "doubleSum", "name": "added", "fieldName": "added" },
            { "type": "doubleSum", "name": "deleted", "fieldName": "deleted" },
            { "type": "doubleSum", "name": "delta", "fieldName": "delta" }
          ],
          "granularitySpec": {
            "type": "uniform",
            "segmentGranularity": "DAY",
            "queryGranularity": "NONE",
            "intervals": ["2016-03-21/2016-03-23"]
          }
        },
        "ioConfig": {
          "type": "hadoop",
          "inputSpec": {
            "type": "static",
            "paths": "/user/services/druid/input/wikipedia_data.json"
          }
        },
        "tuningConfig": {
          "type": "hadoop",
          "partitionsSpec": {
            "targetPartitionSize": 5000000
          },
          "jobProperties": { "mapreduce.job.user.classpath.first": "true" }
        }
      },
      "hadoopCoordinates": "org.apache.hadoop:hadoop-client:2.5.0-cdh5.3.2"
    }

**hdfs data:**

    {"timestamp": "2016-03-22T01:02:33Z", "page": "Gypsy Danger", "language": "en", "user": "nuclear", "unpatrolled": "true", "newPage": "true", "robot": "false", "anonymous": "false", "namespace": "article", "continent": "North America", "country": "United States", "region": "Bay Area", "city": "San Francisco", "added": 57, "deleted": 200, "delta": -143}
    {"timestamp": "2016-03-22T02:32:45Z", "page": "Striker Eureka", "language": "en", "user": "speed", "unpatrolled": "false", "newPage": "true", "robot": "true", "anonymous": "false", "namespace": "wikipedia", "continent": "Australia", "country": "Australia", "region": "Cantebury", "city": "Syndey", "added": 459, "deleted": 129, "delta": 330}
    {"timestamp": "2016-03-22T03:11:21Z", "page": "Cherno Alpha", "language": "ru", "user": "masterYi", "unpatrolled": "false", "newPage": "true", "robot": "true", "anonymous": "false", "namespace": "article", "continent": "Asia", "country": "Russia", "region": "Oblast", "city": "Moscow", "added": 123, "deleted": 12, "delta": 111}
    {"timestamp": "2016-03-22T04:58:39Z", "page": "Crimson Typhoon", "language": "zh", "user": "triplets", "unpatrolled": "true", "newPage": "false", "robot": "true", "anonymous": "false", "namespace": "wikipedia", "continent": "Asia", "country": "China", "region": "Shanxi", "city": "Taiyuan", "added": 905, "deleted": 5, "delta": 900}
    {"timestamp": "2016-03-22T08:41:27Z", "page": "Coyote Tango", "language": "ja", "user": "cancer", "unpatrolled": "true", "newPage": "false", "robot": "true", "anonymous": "false", "namespace": "wikipedia", "continent": "Asia", "country": "Japan", "region": "Kanto", "city": "Tokyo", "added": 1, "deleted": 10, "delta": -9}
    {"timestamp": "2016-03-22T08:42:27Z", "page": "Coyote Tango", "language": "ja", "user": "cancer", "unpatrolled": "true", "newPage": "false", "robot": "true", "anonymous": "false", "namespace": "wikipedia", "continent": "Asia", "country": "Japan", "region": "Kanto", "city": "Tokyo", "added": 1, "deleted": 10, "delta": -9}

Maybe it is just a timezone issue: Druid expects every JVM it runs in, including the Hadoop map and reduce tasks, to use the UTC timezone (-Duser.timezone=UTC), so a cluster whose default timezone is not UTC can leave the indexer with no rows assigned to any time bucket.
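One quick thing to rule out (a sketch, assuming the Hadoop JVMs default to a non-UTC timezone): bare dates such as 2016-03-21 in the intervals are parsed in the JVM's default zone, so the bucket boundaries may not be the UTC midnights that the Z-suffixed timestamps in your data imply. Writing the endpoints as explicit UTC instants removes the ambiguity:

    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "DAY",
      "queryGranularity": "NONE",
      "intervals": ["2016-03-21T00:00:00.000Z/2016-03-23T00:00:00.000Z"]
    }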

On Wednesday, March 23, 2016 at 8:41:20 PM UTC+8, Felix Cui wrote:

Thanks a lot.

It was indeed an incorrect timezone in the MR job.

After setting the Hadoop job properties ("mapreduce.map.java.opts": "-Duser.timezone=UTC -Dfile.encoding=UTF-8", "mapreduce.reduce.java.opts": "-Duser.timezone=UTC -Dfile.encoding=UTF-8"), the batch ingestion job with the indexing service ran successfully.
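For reference, this is what the tuningConfig from the spec above looks like with those properties merged in (a sketch; the two java.opts values are exactly the ones quoted above):

    "tuningConfig": {
      "type": "hadoop",
      "partitionsSpec": {
        "targetPartitionSize": 5000000
      },
      "jobProperties": {
        "mapreduce.job.user.classpath.first": "true",
        "mapreduce.map.java.opts": "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
        "mapreduce.reduce.java.opts": "-Duser.timezone=UTC -Dfile.encoding=UTF-8"
      }
    }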

On Thursday, March 24, 2016 at 1:51:23 PM UTC+8, Ninglin Du wrote: