[druid-user] Re: Kafka consumer Speed on Druid

Increasing Kafka partitions for a particular topic will increase the degree of parallelism.

You can reduce the processing time per row by keeping your schema lean, for sure. Once you’ve done that, you can scale out Kafka ingestion by adding more subtasks to your supervisor. You want a sensible ratio between the number of Kafka partitions (as per the post above) and the number of subtasks, which is set by “taskCount”:
https://druid.apache.org/docs/0.20.1/development/extensions-core/kafka-ingestion.html#kafkasupervisorioconfig
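
To get a feel for that ratio, here is a rough Python sketch of how partitions end up spread across subtasks. It assumes partitions are dealt out by partition id modulo taskCount, which is my understanding of what the supervisor does but is not something I have checked against your version. The function name is made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def assign_partitions(num_partitions: int, task_count: int) -> Dict[int, List[int]]:
    """Spread partition ids over reading tasks.

    Assumption: partition p goes to task group (p % task_count).
    Task groups that receive no partitions are simply absent from the result.
    """
    if num_partitions < 1 or task_count < 1:
        raise ValueError("num_partitions and task_count must both be >= 1")
    groups: Dict[int, List[int]] = defaultdict(list)
    for partition in range(num_partitions):
        groups[partition % task_count].append(partition)
    return dict(groups)


if __name__ == "__main__":
    # 12 partitions: see how different taskCount values balance out.
    for task_count in (3, 4, 5, 16):
        layout = assign_partitions(12, task_count)
        sizes = sorted(len(parts) for parts in layout.values())
        print(f"taskCount={task_count}: {len(layout)} busy tasks, partitions per task={sizes}")
```

Two things tend to show up when you play with the numbers: a taskCount that divides the partition count evenly gives every subtask the same share, and a taskCount larger than the partition count leaves the extra subtasks with nothing to read.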

Note that your Middle Manager configuration file sets a cap on the number of task slots available, and these subtasks count against that cap.
See Configuration reference · Apache Druid → “druid.worker.capacity”
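
A quick way to sanity-check slots against subtasks is sketched below in Python. It assumes the capacity-planning rule of thumb from the Kafka ingestion docs as I remember it: reading tasks need replicas * taskCount slots, and you want about double that so one set of tasks can publish while the next set reads. The function names and the example capacities are hypothetical.

```python
from typing import Iterable


def slots_needed(task_count: int, replicas: int = 1, include_publishing: bool = True) -> int:
    """Estimate the worker slots a single supervisor wants.

    Assumption: reading needs replicas * task_count slots, and the same
    number again is needed while the previous tasks are still publishing.
    """
    reading = task_count * replicas
    return reading * 2 if include_publishing else reading


def fits(task_count: int, replicas: int, worker_capacities: Iterable[int]) -> bool:
    """True if the summed druid.worker.capacity values cover the estimate."""
    return slots_needed(task_count, replicas) <= sum(worker_capacities)


if __name__ == "__main__":
    # Two Middle Managers, each with druid.worker.capacity=4 (example values).
    capacities = [4, 4]
    for task_count in (2, 4, 6):
        print(task_count, slots_needed(task_count), fits(task_count, 1, capacities))
```

Other ingestion or compaction tasks share the same slots, so treat the result as a lower bound rather than a target.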

Each subtask is assigned to a single core on your data node. As a rough guide, you should be getting around 10,000 events per second per core.
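
Turning that figure into a starting taskCount is simple arithmetic, sketched here in Python. The 10,000 events per second per core number is only the rule of thumb from this post, so measure your own rate and plug it in. The function name and the sample rates are made up for illustration.

```python
import math
from typing import Optional


def tasks_for_rate(events_per_sec: float,
                   per_task_rate: float = 10_000,
                   num_partitions: Optional[int] = None) -> int:
    """Estimate how many subtasks (one core each) a target event rate needs.

    per_task_rate defaults to the rule-of-thumb figure; it is an assumption,
    not a guarantee. If num_partitions is given, the result is capped at it,
    since extra subtasks beyond the partition count have nothing to read.
    """
    if per_task_rate <= 0:
        raise ValueError("per_task_rate must be positive")
    needed = max(1, math.ceil(events_per_sec / per_task_rate))
    if num_partitions is not None:
        needed = min(needed, num_partitions)
    return needed


if __name__ == "__main__":
    print(tasks_for_rate(45_000))                    # 5
    print(tasks_for_rate(45_000, num_partitions=3))  # 3
```

If the cap kicks in, as in the second call, that is the sign you need more Kafka partitions before more subtasks will help.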

TT, if you’re not already doing so, it’s probably a good idea to have Druid emit its metrics so you can keep an eye on your lag times.