How NOT to query historical nodes while lookups are still loading

Hi!

We have the following problem with our cluster: when a node needs to be restarted (manually or because the server was down for any reason), it registers itself with the coordinator as “ready for queries” before the lookups are loaded. This leads to exceptions when the broker starts to use the recently started node.

We use lookups to map IDs to names for many dimensions and it takes a few minutes until everything is filled again.

Is there any way to change this behaviour? Is there a way to see if all lookups are filled properly or still empty? Can we somehow disable a historical node until the lookups are filled?

We are on version “0.15.1-incubating” atm.

Thanks for your help!

Christian

Does looking at the status help?

https://druid.apache.org/docs/latest/querying/lookups.html#list-load-status-of-lookups-in-a-tier
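
In case it is useful, here is a minimal sketch of polling that status endpoint from a script. It assumes the Coordinator exposes GET /druid/coordinator/v1/lookups/status/{tier} and answers with a JSON object mapping each lookup name to an object with a boolean "loaded" field, as the linked page describes; please check this against the docs for your version. The Coordinator address, the tier name and both function names are placeholders of mine, not part of Druid.

```python
import json
import urllib.request


def pending_lookups(tier_status):
    """Return the sorted names of lookups that are not reported as loaded.

    Assumes tier_status looks like {"lookupName": {"loaded": true}, ...}.
    A lookup with no "loaded" field is treated as not loaded.
    """
    return sorted(
        name for name, info in tier_status.items()
        if not info.get("loaded", False)
    )


def fetch_tier_status(coordinator_url, tier):
    """GET the lookup load status for one tier from the Coordinator."""
    url = f"{coordinator_url}/druid/coordinator/v1/lookups/status/{tier}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical Coordinator address and tier name; adjust to your cluster.
    status = fetch_tier_status("http://localhost:8081", "__default")
    print("not loaded yet:", pending_lookups(status))
```

Note that this only reports the load status; by itself it does not keep the brokers from sending queries to a node.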

No, unfortunately not.

The problem is that the historical node announces itself to the Coordinator after it has loaded all segments into the cache at startup. At this stage the lookups are not filled yet, but the Coordinator already tells the Brokers to use the freshly booted historical node. This causes all queries that use lookups to fail until the lookups are finally filled some minutes later (we have many lookups).