Concurrent queries: Scale broker nodes or historical nodes


If a Druid cluster receives frequent concurrent queries, should the number of brokers or the number of historicals be increased?

My understanding is that the broker connects to the historicals as well as the realtime nodes in order to get the segments required to respond to queries. If there are concurrent requests, would having multiple brokers help? Or would adding more brokers not help, since the number of historicals remains the same?



It depends somewhat on the types of queries being issued, but typically you would need to scale out your historical nodes much sooner than the brokers, since the historicals do the majority of the heavy lifting. A rule of thumb would be one additional broker for every 10-20 additional historicals. If you look at CPU usage and find that the brokers are maxing out CPU while the historicals are not, that’s one indication that you could benefit from additional brokers.
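
To make the rule of thumb and the CPU check concrete, here is a rough sketch in Python. It is not part of Druid; the function names, the 0.85 "busy" threshold, and the choice to favor historicals when both tiers are busy are assumptions made for illustration. The CPU figures would come from whatever monitoring you already have.

```python
def extra_brokers_range(added_historicals):
    """Rule of thumb from above: one extra broker per 10-20 extra historicals.

    Returns (low, high): low assumes 1 broker per 20 historicals,
    high assumes 1 broker per 10 historicals.
    """
    low = added_historicals // 20
    high = added_historicals // 10
    return low, high


def scaling_hint(broker_cpu, historical_cpu, busy=0.85):
    """Suggest which tier to scale from per-node CPU utilization (0.0-1.0).

    `busy` is a hypothetical threshold, not a Druid setting.
    """
    broker_busy = max(broker_cpu) >= busy
    historical_busy = max(historical_cpu) >= busy

    # Historicals do most of the work, so treat them as the first bottleneck.
    if historical_busy:
        return "add historicals"
    # Brokers saturated while historicals have headroom: brokers are the limit.
    if broker_busy:
        return "add brokers"
    return "no CPU bottleneck"


if __name__ == "__main__":
    print(extra_brokers_range(40))                 # (2, 4)
    print(scaling_hint([0.95], [0.40, 0.50]))      # add brokers
```

CPU is only one signal here; treat the output as a starting point for investigation rather than a sizing formula.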