Batch Data Ingestion "Field Type Error"

I’m trying to ingest data into Druid by posting an index task to the overlord node. The JSON data looks like this:

{"ts": "2015-05-20T04:49:21Z", "m_nights_booked": 1}
{"ts": "2015-05-20T00:02:40Z", "m_nights_booked": 1}

and the task JSON file looks like this:

{
  "type": "index",
  "spec": {
    "dataSchema": {
      "dataSource": "airbnb",
      "parser": {
        "type": "string",
        "parseSpec": {
          "format": "json",
          "timestampSpec": {
            "column": "ts",
            "format": "auto"
          },
          "dimensionsSpec": {
            "dimensions": [
              "m_nights_booked",
              "ts"
            ],
            "dimensionExclusions": [],
            "spatialDimensions": []
          }
        }
      },
      "metricsSpec": [
        {
          "type": "count",
          "name": "count"
        },
        {
          "type": "doubleSum",
          "name": "m_nights_booked",
          "fieldName": "m_nights_booked"
        }
      ],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "HOUR",
        "queryGranularity": "NONE",
        "intervals": ["2015-05-20/2015-05-21"]
      }
    },
    "ioConfig": {
      "type": "index",
      "firehose": {
        "type": "local",
        "baseDir": "examples/qi/",
        "filter": "air_test_data.json"
      }
    },
    "tuningConfig": {
      "type": "index",
      "targetPartitionSize": 0,
      "rowFlushBoundary": 0
    }
  }
}

but the ingestion always fails, and the log says:

com.metamx.common.ISE: Cannot merge columns of type[STRING] and [FLOAT]
    at io.druid.segment.column.ColumnCapabilitiesImpl.merge(ColumnCapabilitiesImpl.java:124) ~[druid-processing-0.7.1.1.jar:0.7.1.1]
    at io.druid.segment.IndexMaker.makeIndexFiles(IndexMaker.java:439) ~[druid-processing-0.7.1.1.jar:0.7.1.1]
    at io.druid.segment.IndexMaker.merge(IndexMaker.java:329) ~[druid-processing-0.7.1.1.jar:0.7.1.1]
    at io.druid.segment.IndexMaker.persist(IndexMaker.java:184) ~[druid-processing-0.7.1.1.jar:0.7.1.1]
    at io.druid.segment.IndexMaker.persist(IndexMaker.java:151) ~[druid-processing-0.7.1.1.jar:0.7.1.1]
    at io.druid.segment.IndexMaker.persist(IndexMaker.java:132) ~[druid-processing-0.7.1.1.jar:0.7.1.1]
    at io.druid.indexing.common.index.YeOldePlumberSchool$1.spillIfSwappable(YeOldePlumberSchool.java:206) ~[druid-indexing-service-0.7.1.1.jar:0.7.1.1]
    at io.druid.indexing.common.index.YeOldePlumberSchool$1.persist(YeOldePlumberSchool.java:136) ~[druid-indexing-service-0.7.1.1.jar:0.7.1.1]
    at io.druid.indexing.common.task.IndexTask.generateSegment(IndexTask.java:378) ~[druid-indexing-service-0.7.1.1.jar:0.7.1.1]
    at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:186) ~[druid-indexing-service-0.7.1.1.jar:0.7.1.1]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:235) [druid-indexing-service-0.7.1.1.jar:0.7.1.1]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:214) [druid-indexing-service-0.7.1.1.jar:0.7.1.1]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_25]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_25]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_25]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_25]

When I remove the following metric from the metricsSpec section, everything works. I don’t quite get it: shouldn’t the m_nights_booked field in my JSON data be of float type?

{
  "type": "doubleSum",
  "name": "m_nights_booked",
  "fieldName": "m_nights_booked"
}

Hi Qi, what version of Druid is this? I am looking into this issue right now and will report back shortly.

As a very quick workaround, can you remove “m_nights_booked” from your list of dimensions and try again?

"dimensionsSpec": {
  "dimensions": [
    "m_nights_booked",
    "ts"
  ],

Hi Qi,

So looking at your schema and data, the problem is that you’ve listed the m_nights_booked field as both a dimension and a metric. In your case, I believe you want that field to be a metric and removing it from the list of dimensions should resolve your problem.

I’ve created PR https://github.com/druid-io/druid/pull/1399/files to add better messages for this use case.
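Until a clearer error message is available, the same mistake can be caught before a task is submitted. The following is only a minimal sketch in Python, not part of Druid: the function names and the trimmed-down EXAMPLE_TASK are hypothetical, and it assumes the task JSON has the same layout as the spec posted above (spec -> dataSchema -> parser -> parseSpec -> dimensionsSpec, plus metricsSpec).

```python
import copy

# Trimmed-down copy of the task from this thread (illustrative only).
EXAMPLE_TASK = {
    "type": "index",
    "spec": {
        "dataSchema": {
            "dataSource": "airbnb",
            "parser": {
                "type": "string",
                "parseSpec": {
                    "format": "json",
                    "timestampSpec": {"column": "ts", "format": "auto"},
                    "dimensionsSpec": {"dimensions": ["m_nights_booked", "ts"]},
                },
            },
            "metricsSpec": [
                {"type": "count", "name": "count"},
                {
                    "type": "doubleSum",
                    "name": "m_nights_booked",
                    "fieldName": "m_nights_booked",
                },
            ],
        }
    },
}


def find_dimension_metric_conflicts(task):
    """Return the names declared both as a dimension and as a metric output."""
    schema = task["spec"]["dataSchema"]
    dims_spec = schema["parser"]["parseSpec"]["dimensionsSpec"]
    dimensions = set(dims_spec.get("dimensions", []))
    metric_names = {metric["name"] for metric in schema.get("metricsSpec", [])}
    return sorted(dimensions & metric_names)


def without_conflicting_dimensions(task):
    """Return a copy of the task with conflicting names removed from the
    dimension list, so each of those columns is kept as a metric only."""
    conflicts = set(find_dimension_metric_conflicts(task))
    fixed = copy.deepcopy(task)
    dims_spec = fixed["spec"]["dataSchema"]["parser"]["parseSpec"]["dimensionsSpec"]
    dims_spec["dimensions"] = [
        name for name in dims_spec.get("dimensions", []) if name not in conflicts
    ]
    return fixed


if __name__ == "__main__":
    print("conflicts:", find_dimension_metric_conflicts(EXAMPLE_TASK))
    fixed_task = without_conflicting_dimensions(EXAMPLE_TASK)
    print("conflicts after fix:", find_dimension_metric_conflicts(fixed_task))
```

Run against the spec in this thread, the check flags m_nights_booked, which is the column behind the STRING/FLOAT merge error; the second helper applies the same fix suggested above, dropping the name from the dimensions and leaving the metric in place.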

– FJ

It works! Thanks, Fangjin Yang!