Multiple topics for Kafka indexing service


I have two Kafka topics containing data with (almost) the same schema. Some fields used for different metrics differ, but a super-schema can easily be defined.
What would be the best way to ingest from those two topics into one datasource using the Kafka indexing service (exactly-once consumption is important)?

I understand this can be done by creating one topic that mixes the two streams (merging the two topics), but that requires additional resources for almost no benefit, and it is one more place where duplicates can be introduced.
Technically, it seems there should be no problem consuming from multiple topics (it just means reading from more topic-partition pairs). Maybe I'm missing something?

Alternatively, is there a way to efficiently query (groupBy) a union of two datasets that have similar schemas?

Thank you for any clarification!

AFAIK, ingesting from multiple topics into the same datasource is not possible at present using the Kafka indexing service.
What you can do is ingest them into separate datasources and then run a union query across both datasources.


Union queries should always be sent to the broker/router node; they are NOT supported directly by the historical nodes.
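
For illustration, here is a rough sketch of what such a union groupBy could look like when sent to the broker. It only shows the shape of a native query whose dataSource is of type "union". The datasource names (events_topic_a, events_topic_b), the dimension (country), the metric (clicks), the interval and the broker address are all made-up placeholders, and the helper function names are hypothetical as well, so adjust everything to your own setup.

```python
import json
import urllib.request

# Hypothetical broker address; /druid/v2/ is the native query endpoint.
BROKER_URL = "http://localhost:8082/druid/v2/"


def build_union_groupby(datasources, dimensions, metric, interval):
    """Build a native groupBy query over a union of datasources."""
    return {
        "queryType": "groupBy",
        # The union datasource lists every underlying datasource by name.
        "dataSource": {"type": "union", "dataSources": list(datasources)},
        "granularity": "all",
        "dimensions": list(dimensions),
        "aggregations": [
            {"type": "longSum", "name": metric, "fieldName": metric}
        ],
        "intervals": [interval],
    }


def run_query(query, url=BROKER_URL):
    """POST the query to the broker and return the decoded JSON rows."""
    request = urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder names only; nothing is sent until run_query(query) is called.
query = build_union_groupby(
    ["events_topic_a", "events_topic_b"],
    ["country"],
    "clicks",
    "2017-01-01/2017-01-02",
)
print(json.dumps(query, indent=2))
```

Calling run_query(query) against a reachable broker would return the grouped rows. The script above only prints the query, so it is safe to run without a cluster.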

Will it be queried efficiently?


I haven't run many benchmarks with union queries myself, but IMO they should be fine unless you are doing a groupBy with a huge result set.