In the documentation on connection pooling (yet to be published) i read the following:
druid.server.http.numThreads should be set to a value slightly higher than the sum of
druid.broker.http.numConnections across all the Brokers in the cluster.
I recently had issues with groupBy queries taking way too long(couple of seconds) to execute when there are multiple concurrent queries. Part of the problem was large number of very small segments that existed for my datasource (having set segment granularity to 15 minutes) which after getting some guideline on this forum has been somewhat resolved. The other aspect was queries waiting to be executed (seen from query/ttfb).
If each broker has
druid.broker.http.numConnections set to 300 on a 2 broker cluster which has 2 historicals, should
druid.server.http.numThreads on each historical be set to 600? or should this be 300?
Since the brokers would be behind a load balancer, does this mean 600 concurrent queries can be processed by the cluster?
Doesn’t how many concurrent requests can be processed by brokers and historicals doesn’t have anything to do with allocated number of cores and RAM? Even though the brokers can accept 600 connections and make that many requests to historicals, historicals should be able to process that many requests at a time. How can we size the allocation of number of cores in line with druid.server.http.numThreads on historicals? How does druid.processing.numThreads be adjusted to service that many requests?
The logic behind the recommendation is that the number of threads available to handle requests on each historical should be roughly equal to the maximum number of outgoing connections the pool of brokers can make. Since druid.broker.http.numConnections determines the max connections a broker will make to a single historical, 600 would be the right number according to the recommendation. But! - a value of 300 for druid.broker.http.numConnections is extremely high, and doesn’t make sense unless you have a broker with a lot of cores (like a hundred cores). The default of druid.broker.http.numConnections=20 and druid.server.http.numThreads=60 (on the historical) are more reasonable starting points.
Yes - absolutely your concurrency will be limited by the number of cores you have, and an excessive number of threads will lead to poor performance due to overheads such as thread context switching. I suspect that your poor performance when you have concurrent queries has to do with either druid.processing.numThreads or druid.processing.numMergeBuffers being too low. As noted on the configuration page (https://druid.apache.org/docs/latest/configuration/index.html#historical-query-configs), each groupBy query requires a merge buffer so the number of merge buffers you have will effectively be your concurrency limit.
In summary - reduce your HTTP thread pool size and increase your processing thread/merge buffer pool size.
I would also strongly encourage you to use compaction tasks or a similar mechanism to reduce segment fragmentation as you will see by far the greatest performance gains once this is fully resolved. Aim for a target segment size somewhere between 300-700MB.
Could you answer few follow up questions?
I did try compaction and saw significant improvement for slow queries. From what i learnt, the issue was due to small sized segments (15 mins segment granularity) which need to be combined to respond to queries for data with 6 months interval. Since my data size was very very low after compacting i ended up with 3-4 MB data which is way below 300-700 MB.
It is still not clear to me what exactly limits query concurrency. If a single groupBy query takes ~15-20ms, I wouldn’t want say 1000 queries to end up with avg time of 200-300ms which is what is happening at the moment. I thought having more processor and RAM allocated would mean queries would run quicker allowing next queries to run sooner. But there would be some time a query would spend in queue before getting chance to execute.
Doesn’t druid.server.http.numThreads on historical allow more queries to be processed at the same time?
If druid.processing.numMergeBuffers puts a limit on number of concurrent queries, and default value is 2 for that, isn’t this very low? Can we have more MergeBuffers? What puts limit on more numMergeBuffers?
3. What does it mean to have druid.processing.numThreads=7 and druid.processing.numMergeBuffers=2? Does this mean the processor would be underutilized? If both these paremeters are set to 7 on a 8 core vCPU node would it speed up things?
4.What leads to queries being processed faster ensuring that new queries that come in, doesn’t have to spend too much time waiting in queue?
If you don’t have much data coming in, I would recommend switching to DAY segment granularity and also going back and re-indexing the existing data to be DAY granularity.
There are many different factors on the broker/historical/worker(for realtime) that can limit query concurrency. druid.server.http.numThreads could be one factor, but increasing the HTTP thread pool size does not automatically mean higher concurrency if your bottleneck is somewhere else. Setting druid.processing.numThreads=7 on an 8 core instance is a good value, as it usually results in high CPU utilization while saving some CPU capacity for other overhead operations. The default of 2 merge buffers is a reasonable starting point for clusters with mixed query types, since not all query types use merge buffers. If the majority of your queries are groupBys, I would look into increasing this - maybe try setting this to 4 and see if that helps. You can have as many merge buffers as you want, limited by available RAM, but allocating too many will affect your performance (and the buffers will be underutilized since you’ll now be processing-thread constrained) because it takes RAM away from the OS page cache. If there’s not a lot of unused memory on the machine that the OS can utilize as a page cache, it will be forced to constantly evict and reload pages of segment data from disk and you’ll now be I/O constrained.
I don’t know how much memory your machines have, but assuming you have > 12GB, my recommendation on the historical would be to set:
druid.processing.buffer.sizeBytes=300000000 (or even less since your segments are so small)
If you have less memory, reduce the processing buffer size and the JVM heap (Xms/Xmx).
After doing this, again focus on optimizing your segments as you will see by far the greatest performance gains from properly sized segments.
Thank you David for the detailed explanation! It is much clear now.