Join Different Realtime Data Sources in Druid


I have two topics that collect Impression and Conversion data into two different data sources. For analysis we need to join these two data sources. Is there a way, using lookups or something else, to join them?



I don't think Druid currently supports joining data sources. You can join your streams before ingesting into Druid. You can also use batch ingestion to clean up the segments later. Samza currently offers local task storage, which can also be used to join two streams.
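To illustrate the idea of joining the streams before ingestion, here is a minimal sketch in plain Python. It is not Samza code; it only mimics the pattern of keeping impressions in local state and matching conversions against them. The field names (`impression_id`, `campaign`, `revenue`, `timestamp`) and the one-hour retention window are assumptions made for the example, not something from your topics.

```python
def join_impressions_conversions(events, ttl_ms=3_600_000):
    """Join conversions to earlier impressions by impression_id.

    `events` is an iterable of (topic, record) pairs in arrival order.
    All field names here are hypothetical placeholders.
    """
    impressions = {}  # local state: impression_id -> impression record

    for topic, record in events:
        now = record["timestamp"]

        # Evict impressions older than the retention window so state stays bounded.
        expired = [key for key, imp in impressions.items()
                   if now - imp["timestamp"] > ttl_ms]
        for key in expired:
            del impressions[key]

        if topic == "impressions":
            impressions[record["impression_id"]] = record
        elif topic == "conversions":
            imp = impressions.get(record["impression_id"])
            if imp is not None:
                # Emit one denormalized row, ready to ingest as a single datasource.
                yield {
                    "timestamp": record["timestamp"],
                    "impression_id": record["impression_id"],
                    "campaign": imp["campaign"],
                    "impression_time": imp["timestamp"],
                    "revenue": record["revenue"],
                }


sample_events = [
    ("impressions", {"timestamp": 1000, "impression_id": "a", "campaign": "c1"}),
    ("conversions", {"timestamp": 2000, "impression_id": "a", "revenue": 5.0}),
    ("conversions", {"timestamp": 3000, "impression_id": "missing", "revenue": 1.0}),
]
joined_rows = list(join_impressions_conversions(sample_events))
```

Conversions with no matching impression are dropped in this sketch; in practice you might want to emit them with empty impression fields or hold them for late-arriving impressions instead.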

You can query the datasources together (as if they were a single datasource) by using union queries. See “Union Data Source” in the Druid querying documentation.

But if you are looking for a true join, that’s not supported by Druid. Folks generally do that in a pre-processing step before loading their data into Druid.
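For the union option, a rough sketch of what such a native query might look like, built here as a Python dict and printed as JSON. The datasource names (`impressions`, `conversions`), the metric columns (`impression_count`, `conversion_count`) and the interval are placeholders I made up; substitute your own.

```python
import json


def build_union_timeseries_query(datasources, interval, granularity="hour"):
    """Build a native timeseries query over a union of datasources.

    The metric column names below are hypothetical placeholders.
    """
    return {
        "queryType": "timeseries",
        # A union datasource lists the tables to query as if they were one.
        "dataSource": {"type": "union", "dataSources": list(datasources)},
        "granularity": granularity,
        "intervals": [interval],
        "aggregations": [
            {"type": "longSum", "name": "impressions",
             "fieldName": "impression_count"},
            {"type": "longSum", "name": "conversions",
             "fieldName": "conversion_count"},
        ],
    }


query = build_union_timeseries_query(
    ["impressions", "conversions"], "2016-01-01/2016-01-02"
)
print(json.dumps(query, indent=2))
```

The resulting JSON would be POSTed to the broker like any other native query. Note that this only aggregates both datasources side by side per time bucket; it does not match an individual conversion to its impression, which is why a true join still needs the pre-processing step.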