We have multiple broker nodes running in our Druid cluster, and we have created an AWS load balancer in front of them. I have a question regarding the Druid broker configuration. In the broker runtime.properties, how should I configure druid.host? Should I use the load balancer name, or the EC2 instance hostname of each broker node?
druid.host is meant to be the name or address that is both (a) how other nodes can reach this node, and (b) unique. So you’d want to use the instance hostname or address. Alternatively you can leave it blank, and Druid will default to the machine’s own canonical hostname. That works in most environments.
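As a rough sketch, the relevant lines in one broker's runtime.properties might look like the following. The hostname is a made-up placeholder standing in for that instance's own EC2 hostname, and each broker would carry its own value; the load balancer name does not appear anywhere.

```properties
# Broker runtime.properties (sketch; the hostname below is a hypothetical placeholder)
druid.service=druid/broker
# Unique, reachable address of THIS broker instance, not the load balancer.
# Omit this line to let Druid fall back to the machine's canonical hostname.
druid.host=ip-10-0-1-23.ec2.internal
```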
Thank you, Gian.
So we only call the load balancer when we make a call to query Druid from outside, correct?
I noticed that after I set up the load balancer in front of the Druid broker nodes, the brokers do not pick up a newly added data source. I checked the broker log file and do not see the new data source name there, although I can see data for the new data source under the segments-cache directory. Why would the broker behave like this after adding a load balancer?
Yeah, the load balancer would be for you to use, not for Druid.
The load balancer setup and the data source not showing up on the broker sound like a coincidence. There shouldn’t be a relationship there.