Separated Replication


I want to use Druid to build the following scenario:

I want to separate data that is geographically far apart, i.e. located in different data centers. Each Historical should hold its own data but never the data of a distant Historical. The exception is the Cloud Server, which is supposed to contain everything. So is there a way to say, for example in a Kafka supervisor config, "replicas: 2, but only to Historical A and Cloud Server"?
Best regards, Calvin


You could probably configure something like this using load rules:

You’d place Historical A, Historical B, and Cloud Server in three different tiers. Then, for a given datasource, you’d specify a rule that loads one replica on Historical A’s tier and one replica on Cloud Server’s tier (and none on Historical B’s tier).
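As a rough sketch of what such a rule could look like, the Python snippet below builds a single `loadForever` rule with a `tieredReplicants` map and prints it as JSON. The tier names `dc_a`, `dc_b` and `cloud` are made up for illustration; each Historical would need the matching name set via `druid.server.tier` in its runtime properties. The helper `build_load_rules` is hypothetical too, not part of Druid.

```python
import json


def build_load_rules(local_tier, cloud_tier):
    """Build a rule list that keeps one replica on each of the two tiers.

    Tiers that are not listed in tieredReplicants (e.g. a hypothetical
    "dc_b") receive no replicas from this rule.
    """
    return [
        {
            "type": "loadForever",
            "tieredReplicants": {local_tier: 1, cloud_tier: 1},
        }
    ]


# Hypothetical tier names: "dc_a" for Historical A, "cloud" for Cloud Server.
rules_for_a = build_load_rules("dc_a", "cloud")
print(json.dumps(rules_for_a, indent=2))
```

The printed JSON array is what you would submit as the rule set for the datasource fed by data center A, either through the Coordinator console or by POSTing it to the Coordinator's `/druid/coordinator/v1/rules/{dataSourceName}` endpoint. A datasource fed by data center B would get the same rule with `dc_b` in place of `dc_a`. Please check the rule format against the documentation for your Druid version before relying on it.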

One general note to consider - ZooKeeper doesn’t perform particularly well over high-latency links, so it may be a better idea to set up two completely separate Druid clusters rather than trying to have a cluster shared across multiple data centers.

Thank you very much!

I’ll treat them as separate databases then and just replicate the data that I need with my own solution.
The hint about the load rule configuration is still very useful!