How does Druid 0.17/0.18 support a customized input row parser?

Our service uses Druid 0.16 and the Kafka indexing service to enable real-time reporting. The format of the records in our Kafka topic doesn’t match the format of the Druid segments we want to store. For example, the Kafka topic data has a field called “labels” that contains a comma-delimited list of labels, and we want to split “labels” into individual values when the Kafka indexing service consumes data from the topic. We achieved this format transformation by extending ByteBufferInputRowParser and declaring it in the ingestion spec. The spec file looks like this:


{
  "type": "kafka",
  "dataSchema": {
    "dataSource": …,
    "parser": {
      "type": "myRowParser",
      "parseSpec": {
        "format": "json",
        "dimensionsSpec": …
      }
    },
    "metricsSpec": …,
    "granularitySpec": …
  },
  "tuningConfig": …,
  "ioConfig": …
}
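
To make the transformation concrete, here is a minimal sketch of the label-splitting step in plain Java, with no Druid dependencies. The class name `LabelSplitter`, the method `splitLabels`, and the sample record are hypothetical and only for illustration; in our actual setup this kind of logic sits inside the custom ByteBufferInputRowParser implementation registered as “myRowParser”.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LabelSplitter {
    // Hypothetical helper: returns a copy of the record in which the
    // comma-delimited "labels" string is replaced by a list of labels.
    // Records without a string-valued "labels" field are copied unchanged.
    public static Map<String, Object> splitLabels(Map<String, Object> record) {
        Map<String, Object> out = new HashMap<>(record);
        Object raw = record.get("labels");
        if (raw instanceof String) {
            List<String> labels = new ArrayList<>();
            for (String part : ((String) raw).split(",")) {
                String label = part.trim();
                if (!label.isEmpty()) { // skip empty entries such as "a,,b"
                    labels.add(label);
                }
            }
            out.put("labels", labels);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample record standing in for one decoded Kafka message.
        Map<String, Object> record = new HashMap<>();
        record.put("timestamp", "2020-05-01T00:00:00Z");
        record.put("labels", "red, green,blue");
        System.out.println(splitLabels(record).get("labels")); // prints [red, green, blue]
    }
}
```

The sketch copies the record instead of mutating it, so the original decoded message stays available if later steps need it.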


I noticed that the “parser” component was deprecated in the ingestion spec as of Druid 0.17. However, the official docs don’t give any details or concrete examples of how to create and use a customized row parser in Druid 0.17/0.18. Can anyone provide suggestions? This is a hard blocker for our team’s migration from 0.16 to 0.18.