Ingestion spec - flattenSpec: AVRO record parsing

Hi guys,

Using the latest version of Druid, I wrote a spec to ingest Avro records from Kafka. I’m also trying to flatten a nested field, but the Druid parser can’t access it.

The Kafka AVRO record:

[…]
{
  "field1": null,
  "field2": null,
  "field3": null,
  "field4": {
    "com.bla.bla.bla.Field": {
      "nestedField": {
        "string": "value"
      }
    }
  }
}
[…]

My spec:

[…]
"parseSpec": {
  "format": "avro",
  "timestampSpec": {
    "column": "ts",
    "format": "auto"
  },
  "flattenSpec": {
    "fields": [{
      "type": "path",
      "name": "nestedField",
      "expr": "$.???.nestedField"
    }]
  }
[…]

See the question marks in the spec? I’ve tried

$.device['com.bla.bla.bla.Field'].nestedField

as well as

$.device.com.bla.bla.bla.Field.nestedField

but I get a PathNotFoundException.

Do you have any suggestions?

Thank you in advance

Hey there,

You want to escape the dots in the key name by using brackets.

e.g.

$.field4.['com.bla.bla.bla.Field'].nestedField

This will get nestedField in your example. http://jsonpath.com/ is a handy tool for trying out different expressions (although I don’t think its parser is exactly the same as the one Druid uses).

Cheers,

Dylan

Hi Dylan,

Thanks for your reply!

I tried your approach too, but I still get an error like:

com.jayway.jsonpath.PathNotFoundException: Expected to find an object with property ['deviceName'] in path $['device']['com.powerspace.analytics.avro.Device'] but found 'null'. This is not a json object according to the JsonProvider: 'io.druid.data.input.avro.GenericAvroJsonProvider'.

So none of the following works:

$.field4.com.bla.bla.bla.Field.nestedField
$.field4.['com.bla.bla.bla.Field'].nestedField
$.field4['com.bla.bla.bla.Field'].nestedField

I’m checking out the source code to see if I can figure something out. Otherwise I’ll have to debug my tasks locally.

Solved. I hope this is useful for someone else:

A few messages from the Kafka topic contained:

"field4": {
  "com.bla.bla.bla.Field": null
}

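The strict path failed on those records because the union branch was null, as the exception says. So the solution was to use JsonPath’s deep-scan operator (..), which looks for nestedField at any depth under field4, and set my path to:

$.field4..nestedField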

Now it works like a charm.
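
For anyone landing here later, the flattenSpec part of the spec would then look roughly like the fragment below. This is only a sketch: it assumes the rest of the parseSpec stays exactly as in my first post and that field4 and nestedField stand in for your real field names. The only change is the expr value.

```json
{
  "flattenSpec": {
    "fields": [
      {
        "type": "path",
        "name": "nestedField",
        "expr": "$.field4..nestedField"
      }
    ]
  }
}
```

Keep in mind that a deep scan matches every nestedField found under field4, so if the same key can appear at more than one level of your schema, a more specific path may be safer.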