Hi Druid team,
I have been trying to optimize the configuration of our historical nodes. I have read through a lot of docs and group posts, but I still haven't found answers to all of my questions. Could you please help?
The questions are the following:
- Druid uses the memory-mapped file technique to map segments into memory on historical nodes. The question is: are those segments mapped inside the memory allocated to the Java process, or outside of it? I know the formula is:
memory_for_segments = total_memory - heap_size - (processing.buffer.sizeBytes * (processing.numThreads+1)) - JVM overhead (~1G).
Does total_memory here mean all available RAM on the machine? If so, that means Druid maps segments outside of the Java process, and we need to find a tradeoff between (direct memory + heap) and free memory.
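To make sure I'm applying the formula correctly, here is how I am estimating it for one of our nodes. All the numbers below are hypothetical examples, not our actual settings:

```python
# Hypothetical sizing for one historical node; every value is an example.
GB = 1024 ** 3

total_memory = 64 * GB                 # assuming total_memory = all RAM on the machine
heap_size = 8 * GB                     # -Xmx
buffer_size_bytes = 512 * 1024 ** 2    # processing.buffer.sizeBytes
num_threads = 7                        # processing.numThreads
jvm_overhead = 1 * GB                  # rough ~1G estimate from the formula

memory_for_segments = (
    total_memory
    - heap_size
    - buffer_size_bytes * (num_threads + 1)
    - jvm_overhead
)

print(memory_for_segments / GB)  # RAM left for the OS page cache / mapped segments
```

On these example numbers that leaves 51 GB for mapped segments, which is the figure I would then compare against the node's maxSize.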
What is a healthy ratio of memory_for_segments to a node's maxSize (i.e., a memory-to-disk ratio) for a historical node?
As for caching: if we use a local cache, what is a feasible cache size per node, and where is this cache physically stored, in heap or in direct memory? If it's in direct memory, then the direct memory setting should be increased by the size of the cache, shouldn't it?
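For reference, this is roughly the configuration I am asking about. The property names are from the historical node docs (they may differ across Druid versions), and all the sizes are placeholders:

```
# Hypothetical historical runtime.properties excerpt; sizes are placeholders.
druid.processing.buffer.sizeBytes=536870912
druid.processing.numThreads=7

# Local cache -- does druid.cache.sizeInBytes come out of heap or direct memory?
druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true
druid.cache.type=local
druid.cache.sizeInBytes=1073741824

# On-disk segment cache (the maxSize I mention above)
druid.segmentCache.locations=[{"path":"/var/druid/segment-cache","maxSize":300000000000}]
druid.server.maxSize=300000000000
```

Thanks in advance for any pointers!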