I want to get the counts for all combinations of several dimensions at “hour” granularity over the past week, so I used a “groupBy” query to get them.
But the latency is about 15s~20s, and I want to decrease it dramatically. How can I improve this?
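For reference, the query looks roughly like this (the datasource, dimension names, and interval are placeholders, not my real schema):

{
  "queryType": "groupBy",
  "dataSource": "my_datasource",
  "granularity": "hour",
  "dimensions": ["dim1", "dim2", "dim3"],
  "aggregations": [
    { "type": "count", "name": "count" }
  ],
  "intervals": ["2015-09-01T00:00:00.000/2015-09-08T00:00:00.000"]
}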
The machine types and Druid settings follow this doc: http://druid.io/docs/latest/Production-Cluster-Configuration.html
There are 3 historical nodes serving this cluster, and there are about 70 million rows of data per week.
Would increasing the value of “druid.processing.buffer.sizeBytes” improve it? It seems its maximum value is 2147483647, because its type is an integer.
By the way, if the maximum value (2147483647) of “druid.processing.buffer.sizeBytes” is not enough for the cluster, how should that situation be handled?
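For context, this setting is configured in runtime.properties on the historical and broker nodes along these lines (the value shown here is illustrative, not my actual config):

# size of each processing buffer used for intermediate computations
druid.processing.buffer.sizeBytes=1073741824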
Another question is about “druid.query.groupBy.maxResults”: if the result of a “groupBy” query exceeds it, the query returns “Maximum number of rows reached”.
Can it be set dynamically in the JSON body of the “groupBy” query, or is there some other way to avoid this error?
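For example, I was hoping something like the following would work, overriding the limit per query through the query context (I am not sure a “maxResults” context key is actually supported, this is just what I have in mind):

{
  "queryType": "groupBy",
  "dataSource": "my_datasource",
  "granularity": "hour",
  "dimensions": ["dim1", "dim2"],
  "aggregations": [{ "type": "count", "name": "count" }],
  "intervals": ["2015-09-01/2015-09-08"],
  "context": { "maxResults": 1000000 }
}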