Druid Hortonworks Schema Registry Kafka ingestion failure

Hi,

I'm trying to stream/read data from a Kafka topic in Avro format that was serialized through NiFi using the Hortonworks Schema Registry, but it doesn't seem to work for me. It looks like Druid 0.17 only works with the Confluent Schema Registry, through:

"avroBytesDecoder" : {

"type" : “schema_registry”,

"url" :

}

My Kafka JSON ingestion spec:

{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "location",
    "parser": {
      "type": "avro_stream",
      "avroBytesDecoder": {
        "type": "schema_registry",
        "url": "http://127.0.0.1:9090/api/v1"
      },
      "parseSpec": {
        "format": "avro",
        "timestampSpec": {
          "column": "timestamp",
          "format": "auto"
        },
        "dimensionsSpec": {
          "dimensions": [
            "view"
          ]
        }
      }
    },
    "metricsSpec": [
      {"type": "count", "name": "countagg"}
    ],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "HOUR",
      "queryGranularity": "HOUR",
      "rollup": false,
      "intervals": null
    }
  },
  "tuningConfig": {
    "type": "kafka",
    "reportParseExceptions": false,
    "offsetFetchPeriod": "PT120S",
    "logParseExceptions": true
  },
  "ioConfig": {
    "useEarliestOffset": true,
    "topic": "location",
    "replicas": 1,
    "taskDuration": "PT120M",
    "completionTimeout": "PT240M",
    "consumerProperties": {
      "bootstrap.servers": "localhost:9092"
    }
  }
}

Attached is the error I'm getting.

Regards,


error.log (34.8 KB)

I'm not sure what's going on here. It might be an incompatibility between the Confluent library we're using and the Hortonworks implementation of the schema registry. Perhaps upgrading one or the other would help?

Hi Gian,

Thanks for the feedback. After going through the code, there is a difference between Confluent and Hortonworks in how they implement their schema ids: one is a 4-byte integer while the other is an 8-byte long, among other details. I did a git clone of the Druid source; I haven't modified any code, everything builds successfully and all tests pass, but I'm getting this error:

"exceptionClass": "java.lang.IllegalArgumentException",
"message": "Could not resolve type id 'avro' as a subtype of org.apache.druid.data.input.impl.ParseSpec: known type ids = [csv, javascript, json, jsonLowercase, regex, timeAndDims, tsv] (for POJO property 'parseSpec')\n at [Source: UNKNOWN; line: -1, column: -1] (through reference chain: org.apache.druid.data.input.impl.StringInputRowParser[\"parseSpec\"])",
"streamException": false

It looks like avro is not a known parseSpec format, or something is missing.
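That error usually means the druid-avro-extensions module isn't loaded: the avro parseSpec type is registered by that extension, so Jackson can't resolve the type id without it. A minimal check on the load list (the second entry below is just an example of what else might be there):

# common.runtime.properties
druid.extensions.loadList=["druid-avro-extensions", "druid-kafka-indexing-service"]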

Thanks.

Collins.

The problem above was due to a configuration issue, so I can now test my code, and maybe I can add the enhancement for others to use.
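For anyone who hits the same incompatibility: it comes down to the header each serializer writes in front of the Avro payload. Below is a rough sketch in Java of the two layouts as I understand them; the Hortonworks fields vary by protocol version, so treat the exact byte layout as an assumption rather than a spec.

import java.nio.ByteBuffer;

public class RegistryWireFormats {

    // Confluent wire format: a magic byte 0x00, then a 4-byte int schema id,
    // then the Avro-encoded payload.
    static int readConfluentSchemaId(ByteBuffer buf) {
        byte magic = buf.get();
        if (magic != 0x00) {
            throw new IllegalArgumentException("not the Confluent wire format");
        }
        return buf.getInt(); // 4-byte schema id
    }

    // Hortonworks Schema Registry (assumed default protocol): a 1-byte protocol
    // id, then an 8-byte long schema metadata id and a 4-byte int schema
    // version, then the Avro-encoded payload.
    static long readHortonworksMetadataId(ByteBuffer buf) {
        byte protocolId = buf.get();     // identifies the serializer protocol
        long metadataId = buf.getLong(); // 8-byte schema metadata id
        int version = buf.getInt();      // 4-byte schema version
        return metadataId;
    }
}

So when the Confluent decoder reads a Hortonworks-serialized message, the first byte isn't 0x00 and the bytes that follow aren't a 4-byte schema id, which is why the lookup against the registry fails.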