Druid Re-index issue with data transformation

Hi,

I am trying to re-index a Druid datasource into another datasource.
I used a transform for one column, as shown in the spec below, but the resulting column is created with null values.
Example:

I have created two datasources named “wikipedia” and “inline_data”.
There is a column “namespace” in “wikipedia” with string values. When I run the spec below from the Druid web console, all the data from the “wikipedia” datasource is ingested into “inline_data”, but the “event_name” column, which is transformed from “namespace”, contains only nulls.

The spec is:

{
  "type": "index_parallel",
  "spec": {
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "druid",
        "dataSource": "wikipedia",
        "interval": "2016-06-27T00:00:00/2016-06-27T01:30:00"
      }
    },
    "tuningConfig": {
      "type": "index_parallel",
      "partitionsSpec": {
        "type": "dynamic"
      }
    },
    "dataSchema": {
      "dataSource": "inline_data",
      "granularitySpec": {
        "type": "uniform",
        "queryGranularity": "HOUR",
        "rollup": true,
        "segmentGranularity": "DAY"
      },
      "timestampSpec": {
        "column": "__time",
        "format": "iso"
      },
      "dimensionsSpec": {
        "dimensions": [
          {
            "name": "channel",
            "type": "string"
          },
          {
            "name": "cityName",
            "type": "string"
          },
          {
            "name": "comment",
            "type": "string"
          },
          {
            "name": "countryIsoCode",
            "type": "string"
          },
          {
            "name": "countryName",
            "type": "string"
          },
          {
            "name": "diffUrl",
            "type": "string"
          },
          {
            "name": "flags",
            "type": "string"
          },
          {
            "name": "isAnonymous",
            "type": "string"
          },
          {
            "name": "isMinor",
            "type": "string"
          },
          {
            "name": "isNew",
            "type": "string"
          },
          {
            "name": "isRobot",
            "type": "string"
          },
          {
            "name": "isUnpatrolled",
            "type": "string"
          },
          {
            "name": "namespace",
            "type": "string"
          },
          {
            "name": "page",
            "type": "string"
          },
          {
            "name": "regionIsoCode",
            "type": "string"
          },
          {
            "name": "regionName",
            "type": "string"
          },
          {
            "name": "user",
            "type": "string"
          }
        ]
      },
      "metricsSpec": [
        {
          "type": "longSum",
          "name": "sum_commentLength",
          "fieldName": "sum_commentLength",
          "expression": null
        },
        {
          "type": "longSum",
          "name": "count",
          "fieldName": "count",
          "expression": null
        },
        {
          "type": "longSum",
          "name": "sum_deltaBucket",
          "fieldName": "sum_deltaBucket",
          "expression": null
        },
        {
          "type": "longSum",
          "name": "sum_added",
          "fieldName": "sum_added",
          "expression": null
        },
        {
          "type": "longSum",
          "name": "sum_deleted",
          "fieldName": "sum_deleted",
          "expression": null
        },
        {
          "type": "longSum",
          "name": "sum_delta",
          "fieldName": "sum_delta",
          "expression": null
        }
      ],
      "transformSpec": {
        "transforms": [
          {
            "type": "expression",
            "name": "event_name",
            "expression": "namespace"
          }
        ]
      }
    }
  }
}

Please suggest how I can get values into columns that are transformed from columns of another datasource.

Thanks & Regards
Amit Srivastava

Please add “event_name” to the “dimensions” list in the dimensionsSpec as well.
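
As a rough sketch of that suggestion, the relevant part of the dataSchema could look like the fragment below. Only the two columns involved in the transform are shown. The other dimensions from the original spec are left out for brevity and would stay as they are.

```json
{
  "dimensionsSpec": {
    "dimensions": [
      { "name": "namespace", "type": "string" },
      { "name": "event_name", "type": "string" }
    ]
  }
}
```

The transformSpec itself stays unchanged. The transform only produces the value, and the column is stored only if its name also appears as a dimension.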

Hi Gaurav,

Thanks for your quick response. It’s working now.

Regards
Amit Srivastava

Hi Gaurav,

One more issue: if I keep the “namespace” column in the dimensions spec, “event_name” gets its values, but the “namespace” column is also created in the new datasource.
I want only “event_name”, with values, in the new datasource, and I do not need a “namespace” column.
I tried removing “namespace” from the dimensions spec, but then “event_name” contains no values.

Please suggest.

Regards
Amit Srivastava

Ah I think I have tracked down the issue mentioned by a previous reply:

https://github.com/apache/druid/issues/9914

Have you specified any dimensions in the dimensions part? Maybe that is the issue?
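
One possible reading of that question, offered as an assumption rather than a confirmed fix: in some Druid versions the “druid” input source accepts its own “dimensions” list naming the columns to read from the source segments, separate from the dimensionsSpec that controls what is stored. If your version supports that property (please check the input source documentation for your release, since it may be deprecated or rejected in newer ones), a sketch of the inputSource could be:

```json
{
  "type": "druid",
  "dataSource": "wikipedia",
  "interval": "2016-06-27T00:00:00/2016-06-27T01:30:00",
  "dimensions": [
    "channel", "cityName", "comment", "countryIsoCode", "countryName",
    "diffUrl", "flags", "isAnonymous", "isMinor", "isNew", "isRobot",
    "isUnpatrolled", "namespace", "page", "regionIsoCode", "regionName",
    "user"
  ]
}
```

With “namespace” read at the input source level, the dimensionsSpec could then list “event_name” in place of “namespace”, so that the transform still sees the source value while only “event_name” is written to the new datasource.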