Every historical has a setting, druid.server.maxSize, which I have set to 500 GB. I run 12 historicals serving a total of 6 TB. What is the optimal size for a historical? I found no noticeable difference between six 1 TB historicals and twelve 500 GB historicals. Are there any observations on this?
Unfortunately, there is no optimum size. You’ll find general guidelines and rules-of-thumb in the basic cluster tuning doc.
Regarding druid.server.maxSize, it “controls the total size of segment data that can be assigned by the Coordinator to a Historical.” What’s the size of your segment data? That may explain why you’re not seeing a difference.
The total size of all the segment data is about 5.7 TB. Currently, I am running 12 historicals, each configured with druid.server.maxSize=500GB. Should I instead run six historicals sized at 1 TB each? I believe that, with all other parameters left unchanged, too few historicals can slow down queries, while too many may add overhead latency. Is there a sweet spot?
Thanks for providing your segment data size. I’m sorry I didn’t ask this in my prior response, but are you collecting any metrics?
I’m linking to a discussion about cluster sizing. You’re asking a difficult question, and, unfortunately, there is no simple answer. Stated differently, and quoting from the linked discussion, “[t]his is a complex issue that needs an in-depth analysis . . . .”
Back to your original question, when you were looking for differences between six 1 TB historicals and twelve 500 GB historicals, what were you measuring? For good performance, you’ll want enough historicals to have a good (free system memory / total size of all druid.segmentCache.locations) ratio.
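To see why the two layouts can behave the same, here is a minimal sketch of the (free system memory / total druid.segmentCache.locations size) ratio mentioned above. The per-node RAM figures are illustrative assumptions, not numbers from this thread:

```python
# Hypothetical sizing sketch: compare the free-memory-to-segment-cache ratio
# for two cluster layouts serving the same ~5.7 TB of segment data.
# The free-RAM figures below are illustrative assumptions, not recommendations.

def cache_ratio(free_mem_gb, segment_cache_gb):
    """Free system memory / total size of druid.segmentCache.locations per node."""
    return free_mem_gb / segment_cache_gb

# Twelve historicals, 500 GB segment cache each, assuming ~64 GB free RAM per node
twelve_nodes = cache_ratio(free_mem_gb=64, segment_cache_gb=500)

# Six historicals, 1 TB segment cache each, assuming ~128 GB free RAM per node
six_nodes = cache_ratio(free_mem_gb=128, segment_cache_gb=1000)

print(f"12 x 500 GB: {twelve_nodes:.3f}")  # -> 0.128
print(f" 6 x 1 TB:   {six_nodes:.3f}")     # -> 0.128
```

Note that if the total cluster RAM and total cache size stay the same, the ratio is identical in both layouts, which would be consistent with seeing no performance difference.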
I’ll look forward to continuing this discussion.
Hello, we run fairly large Druid clusters, and we follow a golden rule of a 1:10 ratio between a Historical's RAM and its disk. Usually 10 CPUs are enough for each Historical.
For example, if there’s 256GB RAM available for each historical, we then attach 2560GB SSD to it.
This golden rule works pretty well.
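The rule of thumb above can be written as a one-line helper. The 1:10 ratio and the sample input come from this thread; everything else is illustrative:

```python
# Helper for the 1:10 RAM-to-disk rule of thumb described above.
# The ratio is a rule of thumb from this thread, not official Druid guidance.

def disk_for_ram_gb(ram_gb, ratio=10):
    """SSD capacity (GB) to attach to a historical with the given RAM (GB)."""
    return ram_gb * ratio

print(disk_for_ram_gb(256))  # -> 2560, matching the 256 GB RAM example above
```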
+1 on what Didip mentioned.
A slightly more involved calculation could be: total memory = 10-15% of the actively queryable data size as free memory for the operating system (page caching) + JVM heap (lookup and historical heap) + direct memory for buffers. The calculations for the JVM heap and direct memory are available in the basic cluster tuning doc that Mark pointed out.
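That calculation can be sketched as a small function. The heap and direct-memory figures below are placeholder assumptions; the real values would come from the basic cluster tuning doc:

```python
# Rough sketch of the total-memory calculation described above.
# The heap and direct-memory inputs are placeholder assumptions; derive real
# values from the basic cluster tuning doc.

def total_memory_gb(queryable_data_gb, heap_gb, direct_memory_gb,
                    page_cache_fraction=0.10):
    """Free OS memory for page caching + JVM heap + direct memory for buffers."""
    page_cache_gb = queryable_data_gb * page_cache_fraction  # 10-15% of hot data
    return page_cache_gb + heap_gb + direct_memory_gb

# Example: one historical actively serving ~500 GB of queryable data,
# with an assumed 24 GB heap and 12 GB of direct memory.
print(total_memory_gb(queryable_data_gb=500, heap_gb=24, direct_memory_gb=12))
# -> 86.0
```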