Problem selecting the “Avro Stream” input format when creating an ingestion spec in the Apache Druid UI

Hello!
I have a problem with the “avro stream” ingestion format.

According to the official documentation, when the extension “druid-avro-extensions” is loaded, the following three parsing options should be available: Avro Stream Parser, Avro Hadoop Parser, and Avro OCF Parser.

The following environment variable has been set in the Dockerfile of every Druid service running on the cluster:
ENV druid_extensions_loadList='["druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "postgresql-metadata-storage", "druid-kafka-indexing-service", "druid-hdfs-storage", "druid-avro-extensions", "druid-basic-security", "druid-parquet-extensions"]'

In the Druid UI, the extension seems to be correctly loaded.
[Screenshot: loaded extensions shown in the Druid console]
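(The loaded extensions can also be verified outside the UI: every Druid service exposes a /status endpoint that lists its loaded modules. A quick check, assuming the router is reachable on localhost:8888 and jq is installed:

curl http://localhost:8888/status | jq '.modules[].artifact' | sort -u

If druid-avro-extensions appears in that list on the services doing the ingestion, the extension itself is loaded.)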

However, when creating an ingestion spec from a Kafka topic, only the following data input formats are available; Avro Stream is not among them.

[Screenshot: input format options offered in the console]

Expected result: the Avro Stream input format is available, so that the Schema Registry URL can be provided and the Avro data from the stream can be read.

Thanks for your time

I don’t have a setup to test with right now, but I wonder whether this is an issue with Druid itself or with the console. Do you have a spec you could try submitting through the API? In case it helps, I’ll paste one I used in the past for testing on my laptop; maybe you can work from it. The API call I used in my case was:

curl -X 'POST' -H 'Content-Type:application/json' -d @spec.json http://localhost:8090/druid/indexer/v1/supervisor
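Once the supervisor is accepted, you can also ask the Overlord how it is doing; the supervisor ID matches the dataSource, so in my case something like

curl http://localhost:8090/druid/indexer/v1/supervisor/myDruidDataSource/status

should report whether it is connected to the topic and reading offsets.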

And since I don’t see a way to attach a file, I’ll paste my spec.json here, too:

{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "myDruidDataSource",
    "timestampSpec": {
      "column": "eventReceivedTimestamp",
      "format": "millis",
      "missingValue": "2000-01-01T01:02:03.456"
    },
    "dimensionsSpec": {
      "dimensions": [
        {
          "name": "eventReceivedTimestamp",
          "type": "long"
        },
        "eventType",
        "group1.col1",
        "group1.col1.val1",
        "group1.col1.val2"
      ]
    },
    "metricsSpec": [
      {
        "type": "count",
        "name": "count"
      },
      {
        "type": "filtered",
        "aggregator": {
          "type": "count",
          "name": "group1.col1.val1"
        },
        "filter": {
          "type": "and",
          "fields": [
            {
              "type": "selector",
              "dimension": "eventType",
              "value": "myKeeperVal",
              "extractionFn": null
            }
          ]
        },
        "name": "myKeeperEvent"
      }
    ],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "DAY",
      "queryGranularity": {
        "type": "none"
      },
      "rollup": true,
      "intervals": []
    },
    "transformSpec": {
      "filter": null,
      "transforms": []
    }
  },
  "tuningConfig": {
    "type": "kafka",
    "appendableIndexSpec": {
      "type": "onheap"
    },
    "maxRowsInMemory": 1000000,
    "maxRowsPerSegment": 5000000,
    "intermediatePersistPeriod": "PT10M"
  },
  "ioConfig": {
    "topic": "6867",
    "inputFormat": { 
      "type": "avro_stream",
      "avroBytesDecoder": {
        "type": "schema_registry",
        "url": "http://localhost:8081" 
      },
    "binaryAsString": false
    },
    "consumerProperties": {
      "bootstrap.servers": "localhost:9092"
    },
    "useEarliestOffset": true,
    "useEarliestSequenceNumber": true
  }
}
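
One note on the ioConfig: the spec above assumes a Confluent Schema Registry running at http://localhost:8081. If you don’t have a registry handy, my understanding is that you can instead inline a single reader schema with the schema_inline decoder, along these lines (the record schema below is just a made-up example, and note that this variant expects plain Avro payloads without the Confluent wire-format header):

    "inputFormat": {
      "type": "avro_stream",
      "avroBytesDecoder": {
        "type": "schema_inline",
        "schema": {
          "namespace": "org.example",
          "name": "MyEvent",
          "type": "record",
          "fields": [
            { "name": "eventReceivedTimestamp", "type": "long" },
            { "name": "eventType", "type": "string" }
          ]
        }
      },
      "binaryAsString": false
    }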

Thank you for your prompt response. I will prepare a working spec using our data sources, since the one you provided does not work for me.
I hope to be able to do it tomorrow or on Monday; I will post the results here.

Thank you very much! :slight_smile:

Good afternoon. I have migrated to version 0.22 and replicated your example, and I can confirm that it is possible to ingest data in “avro_stream” format using a spec file submitted via curl, but it is not possible to create the spec from the UI. It would be nice to see this input format added to the Druid UI, but for now, if it works, that’s good enough.

Thank you very much for your time and dedication.