Union datasources work with all query types. You should be able to do something like:
"dataSources": ["one", "two"]
The behavior you get is that all segments from all unioned datasources are queried together, as if they came from the same datasource. If one of the datasources is unavailable then it won’t be included in the query results, but the others still will (same as if some segments from a single datasource were unavailable). If any of this ever doesn’t work then it is a bug, so please report it.
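For reference, here’s a rough sketch of how that fragment fits into a full native query. It builds a timeseries query in Python and prints the JSON you would POST to the broker. The datasource names come from the example above; the interval, the "rows" output name, and the helper function name are made up for illustration.

```python
import json


def union_timeseries_query(datasources, interval):
    """Build a native timeseries query over a union of datasources.

    The helper name and arguments are hypothetical; only the shape of the
    resulting JSON matters.
    """
    if not datasources:
        raise ValueError("at least one datasource is required")
    return {
        "queryType": "timeseries",
        # The union is expressed as a dataSource object of type "union".
        "dataSource": {"type": "union", "dataSources": list(datasources)},
        "granularity": "all",
        "intervals": [interval],
        # Counts Druid rows across every segment of every unioned datasource.
        "aggregations": [{"type": "count", "name": "rows"}],
    }


# Illustrative interval only; substitute whatever range you care about.
query = union_timeseries_query(["one", "two"], "2015-01-01/2015-01-02")
print(json.dumps(query, indent=2))
```

Any other query type should work the same way: swap the plain string "dataSource" for the union object and leave the rest of the query alone.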
If you’re using a datasource-per-datacenter, then you can also tell the overlord stuff like “all work for datasource X should be given to workers in datacenter Y” (using a worker assignment strategy).
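As a sketch of what that could look like, the snippet below builds a worker config payload that maps a datasource to a set of workers. I’m assuming the affinity-style select strategy here ("fillCapacityWithAffinity" with an "affinityConfig" map of datasource name to worker host:port list), which is how I remember the overlord’s dynamic worker configuration; please check the field names against the docs for your Druid version before using it. The hostnames and the helper name are placeholders.

```python
import json


def affinity_worker_config(affinity):
    """Build an overlord worker config that pins datasources to workers.

    `affinity` maps a datasource name to a list of worker "host:port" strings.
    The strategy type and field names are assumptions to verify against the
    documentation for your Druid version.
    """
    return {
        "selectStrategy": {
            "type": "fillCapacityWithAffinity",  # assumed strategy name
            "affinityConfig": {
                "affinity": {ds: list(workers) for ds, workers in affinity.items()}
            },
        }
    }


# Placeholder hosts standing in for the workers in datacenter Y.
config = affinity_worker_config(
    {"X": ["worker1.dc-y.example.com:8080", "worker2.dc-y.example.com:8080"]}
)
print(json.dumps(config, indent=2))
```

The resulting JSON is what you would submit to the overlord as its worker configuration; the point is just that the mapping is per datasource, so a datasource-per-datacenter layout lines up with it naturally.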
Tranquility does do linear shards. If you end up writing a patch that adds an offset feature and you find it valuable, I’m open to merging it. It shouldn’t be too invasive. I think we’d want to mark it as advanced-users-only though.
About the Kafka stuff, someone’s gotta run experimental features in production in order for them to stop being experimental.