I have a two-node Druid setup where each node has 64 GB RAM, 6 TB disk and 40 CPU cores.
I need to configure Druid for the best performance of TopN queries over two types of aggregation: longSum and hyperUnique.
With the current setup, a plain TopN query over a single-valued dimension with a longSum aggregation takes 20+ seconds when fired for the first time and approximately 2 seconds when repeated.
On the historical node, the metric “query/wait/time” has a value of “17935” for the first query.
A TopN query over a single-valued dimension but with a hyperUnique aggregation takes around 40 seconds the first time and around 10 seconds when repeated.
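For reference, the two queries have roughly the shape below. This is only a sketch: the datasource, dimension and metric names, the interval and the threshold are placeholders rather than my real values. The JSON fields follow the standard Druid native topN query format.

```python
import json


def build_topn_query(datasource, dimension, agg_type, metric_field,
                     interval, threshold=10):
    """Build a Druid native topN query body with a single aggregation.

    All argument values used below are hypothetical placeholders.
    """
    agg_name = f"{metric_field}_{agg_type}"
    return {
        "queryType": "topN",
        "dataSource": datasource,
        "dimension": dimension,
        "metric": agg_name,        # rank results by the aggregated value
        "threshold": threshold,    # number of top entries to return
        "granularity": "all",      # one result bucket for the whole interval
        "aggregations": [
            {"type": agg_type, "name": agg_name, "fieldName": metric_field}
        ],
        "intervals": [interval],
    }


# Placeholder names: "my_datasource", "my_dimension", "count", "unique_users".
longsum_query = build_topn_query(
    "my_datasource", "my_dimension", "longSum", "count",
    "2015-01-01/2016-01-01")
hyperunique_query = build_topn_query(
    "my_datasource", "my_dimension", "hyperUnique", "unique_users",
    "2015-01-01/2016-01-01")

print(json.dumps(longsum_query, indent=2))
```

There is no filter in either query, so every segment in the interval is scanned.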
The JVM options passed to the historical node are: "-Xmx4g -Xms1g -XX:MaxNewSize=2g". I also verified that the load average is not high on these nodes during the query.
The other relevant configurations of the historical node are:
The broker configurations are:
-Xmx4g -Xms2g -XX:NewSize=1g -XX:MaxNewSize=2g -XX:MaxDirectMemorySize=64g
I also observed that the broker metric “query/node/ttfb” is almost the same as the total query time, so I believe most of the query time is spent on the historical nodes.
The datasource I am querying has 933 segments of ~260 MB each, and since the query has no filters it scans the entire dataset.
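A rough back-of-the-envelope calculation from the numbers above, in case it helps with diagnosis. It assumes the segments are spread evenly across the two historicals with no replication, and it treats everything outside the 4 GB heap as available for the OS page cache. That is an optimistic upper bound, since direct memory and other processes also need RAM.

```python
# Figures taken from the post.
num_segments = 933
segment_size_mb = 260      # approximate size per segment
num_nodes = 2
ram_per_node_gb = 64
historical_heap_gb = 4     # from -Xmx4g

# Total data scanned by an unfiltered query.
total_gb = num_segments * segment_size_mb / 1024

# Assumption: even distribution across nodes, replication factor 1.
per_node_gb = total_gb / num_nodes
segments_per_node = num_segments / num_nodes

# Optimistic upper bound on memory left for memory-mapped segments.
page_cache_upper_bound_gb = ram_per_node_gb - historical_heap_gb

print(f"total data:          {total_gb:.1f} GB")
print(f"data per node:       {per_node_gb:.1f} GB")
print(f"segments per node:   {segments_per_node:.0f}")
print(f"page cache (max):    {page_cache_upper_bound_gb} GB")
print(f"fits in page cache:  {per_node_gb <= page_cache_upper_bound_gb}")
```

Under these assumptions each node holds roughly twice as much segment data as it could keep memory-mapped in RAM, so I suspect the first query is paying for disk reads, but I am not sure that explains the wait time on its own.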
What could be the reason for such a high wait time?
And which configurations would help me optimize the cluster for TopN queries?