The Druid documentation states that the multi inputSpec can reference only a single datasource:
“This is a composing inputSpec to combine other inputSpecs. This inputSpec is used for delta ingestion. Please note that you can have only one dataSource as child of
I do not understand why this is a requirement. I also looked through the code and couldn’t find any place where Druid checks that only a single datasource is present within a MultiplePathSpec.
For context, I’m trying to ingest data separately per region / datacenter and then merge it into a single datasource, but I can’t find a good way to accomplish this. What I have considered so far:
UNION queries are too slow to use for this.
Index tasks along with Firehoses can do this, but the documentation states that they are not appropriate for big data volumes.
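
To make the question concrete, here is a minimal sketch of the kind of inputSpec in question. The field names ("multi", "children", "dataSource", "ingestionSpec", "intervals") follow the multi inputSpec example in the Druid batch ingestion docs as far as I can tell. The datasource names events_us and events_eu, the interval, and both helper functions are hypothetical and only for illustration. Whether Druid actually accepts a spec with more than one dataSource child is exactly the open question.

```python
import json


def build_multi_input_spec(datasources, interval):
    """Build a 'multi' inputSpec with one 'dataSource' child per name.

    Hypothetical helper: it only assembles a dict, it does not talk to Druid.
    """
    return {
        "type": "multi",
        "children": [
            {
                "type": "dataSource",
                "ingestionSpec": {
                    "dataSource": name,
                    "intervals": [interval],
                },
            }
            for name in datasources
        ],
    }


def count_datasource_children(input_spec):
    """Count children of type 'dataSource', i.e. what the docs say must be 1."""
    return sum(
        1
        for child in input_spec.get("children", [])
        if child.get("type") == "dataSource"
    )


# Assumed per-region datasources that should end up merged into one.
spec = build_multi_input_spec(
    ["events_us", "events_eu"], "2016-01-01/2016-01-02"
)

print(json.dumps(spec, indent=2))
print("dataSource children:", count_datasource_children(spec))
```

The printed JSON would go under ioConfig.inputSpec of a Hadoop index task. The count helper is the sort of check I expected to find somewhere in the Druid code but did not.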