Druid Kafka Indexing Service: 'Preselect' data depending on value of field/row/dimension


Imagine there is a Kafka topic with a field/dimension ‘country’. This field holds three different values: DE, AT and US.

Now my question is:
Is it possible to preselect data from a Kafka topic at ingestion time, depending on the value of ‘country’, and store it in different Druid datasources?

I want to create three different ingestion specs, run three different Kafka indexing supervisors, and get three different Druid datasources (with all three reading data from ONE Kafka topic).
E.g.: my_datasource_AT, my_datasource_DE and my_datasource_US

Thanks, Alex

In general, message/event filtering at index time is not a feature provided by the Kafka Indexing Service (KIS) or other indexing methods. If you actually want to do this, you will need to create a front-end processor that streams your source Kafka topic into separate per-country topics. Are you sure you have a compelling reason not to use a single datasource with a country filter at query time?
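
For illustration, here is a minimal sketch of what such a front-end processor could look like in Python. It is not a tested production setup and rests on several assumptions: messages are JSON objects with a top-level ‘country’ field, the output topics follow a hypothetical naming scheme (events_AT, events_DE, events_US), and the consumer and producer behave like kafka-python's KafkaConsumer and KafkaProducer (iterating the consumer yields records with a .value attribute, and the producer has send(topic, value)). Anything that cannot be parsed or has an unknown country is simply dropped here; a real processor would probably want a dead-letter topic instead.

```python
import json

# Assumption: only these three values occur, as in the question.
ALLOWED_COUNTRIES = {"AT", "DE", "US"}
# Hypothetical naming scheme for the per-country output topics.
TOPIC_PREFIX = "events_"


def target_topic(raw_value):
    """Return the per-country topic for one JSON message, or None to drop it."""
    try:
        event = json.loads(raw_value)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return None
    if not isinstance(event, dict):
        return None
    country = event.get("country")
    if not isinstance(country, str) or country not in ALLOWED_COUNTRIES:
        return None
    return TOPIC_PREFIX + country


def route(consumer, producer):
    """Forward every message from the source topic to its country topic."""
    for message in consumer:
        topic = target_topic(message.value)
        if topic is not None:
            # The original payload is forwarded unchanged.
            producer.send(topic, message.value)
```

With kafka-python installed, route() could be called with something like KafkaConsumer("events", bootstrap_servers=...) and KafkaProducer(bootstrap_servers=...), where "events" stands in for the source topic name. Each of the three supervisors would then read its own topic.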


At the moment we use a single datasource and filter results at query time. We also use Superset as a self-service tool for our “customers” to visualize data.
In reality we do not have this dimension ‘country’; we have something like ‘company’.

The problem is that we want, or rather have, to prevent people from company A from querying data belonging to company B or company C, and so on…
Our first thought was that we could handle this with different datasources. As I understand it, we have to do this by creating separate topics in Kafka.

Regards, Alex