Question on druid memory configs

Hi,

On http://druid.io/docs/latest/operations/performance-faq.html I saw this recommendation:

memory_for_segments = total_memory - heap - direct_memory - jvm_overhead

What are the corresponding configs for those attributes? To my understanding, they are:

total_memory: Total RAM of the machine

direct_memory: -XX:MaxDirectMemorySize=2000m (found inside jvm.config)

heap: -Xms / -Xmx ???

jvm_overhead: ???

memory_for_segments: is it druid.processing.buffer.sizeBytes??

Thanks

Hi,

I’m not an expert, but to my understanding you are right about the heap and direct memory configuration. However, memory_for_segments has nothing to do with the processing buffer setting. In Java you can only configure the heap (-Xmx) and the direct memory (-XX:MaxDirectMemorySize). The rest of the system’s RAM will then be usable for memory-mapping the segments (as the documentation states, that should be the default unless the memory quota is limited explicitly via a Linux cgroup setting). The formula subtracts jvm_overhead because some memory will be eaten up by the Java platform itself.
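To make the formula concrete, here is a minimal sizing sketch in Python. The machine size and the heap/direct-memory values are hypothetical examples, not recommendations:

```python
# Hypothetical sizing for a 64 GB historical node (illustrative values only).
GB = 1024 ** 3
total_memory  = 64 * GB   # total RAM of the machine
heap          = 8 * GB    # -Xmx8g in jvm.config
direct_memory = 16 * GB   # -XX:MaxDirectMemorySize=16g in jvm.config
jvm_overhead  = 1 * GB    # rough allowance for metaspace, thread stacks, etc.

# memory_for_segments = total_memory - heap - direct_memory - jvm_overhead
memory_for_segments = total_memory - heap - direct_memory - jvm_overhead
print(memory_for_segments // GB)  # -> 39 (GB left for memory-mapping segments)
```

Whatever is left over after the JVM's explicit allocations is what the OS page cache can use for the mmapped segment files.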

The following page is also a nice example of the kinds of settings you’d have on each node type: http://druid.io/docs/latest/configuration/production-cluster.html

To my understanding, the processing buffer is for storing intermediate query results. The performance FAQ says that both off-heap and on-heap memory is used by the processing buffer: “Historical nodes use off-heap memory to store intermediate results” … “On historicals, the JVM heap is used for GroupBy queries, some data structures used for intermediate computation”. I think that off-heap here is synonymous with direct memory.

See inline

Hi,

On http://druid.io/docs/latest/operations/performance-faq.html I saw this recommendation:

memory_for_segments = total_memory - heap - direct_memory - jvm_overhead

What are the corresponding configs for those attributes? To my understanding, they are:

total_memory: Total RAM of the machine

direct_memory: -XX:MaxDirectMemorySize=2000m (found inside jvm.config)

direct_memory should be at least (druid.processing.numThreads * druid.processing.buffer.sizeBytes)

Generally a processing buffer of 512MB should be enough, and processing threads = cpu_cores - 1.
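As a quick sketch of that rule of thumb (the core count below is an assumed example, not from the thread):

```python
# Lower bound for -XX:MaxDirectMemorySize on a hypothetical 16-core historical.
cpu_cores = 16
num_threads = cpu_cores - 1               # druid.processing.numThreads
buffer_size_bytes = 512 * 1024 * 1024     # druid.processing.buffer.sizeBytes (512 MB)

# direct memory must cover one processing buffer per processing thread
min_direct_memory = num_threads * buffer_size_bytes
print(min_direct_memory / 1024 ** 3)      # -> 7.5 (GB minimum direct memory)
```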

heap: -Xms / -Xmx ???

A 3-4G heap should be sufficient for small/medium-size setups. GroupBy queries use heap memory, so you might need to raise it to 6G if you use groupBys extensively.

jvm_overhead: ???

This includes system overheads too, i.e. memory used by other OS daemons and processes; typically 1G-2G.

memory_for_segments: is it druid.processing.buffer.sizeBytes??

No, this is the system memory available for memory-mapping Druid segment files.

It is bounded by the config druid.server.maxSize, the maximum total size of segments that can be assigned to a historical node.
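Putting the pieces together, a historical node's jvm.config and runtime.properties might look roughly like this. All values are illustrative placeholders, not tuned recommendations:

```
# jvm.config (illustrative values)
-Xms4g
-Xmx4g
-XX:MaxDirectMemorySize=8g

# runtime.properties (illustrative values)
druid.processing.numThreads=7
druid.processing.buffer.sizeBytes=536870912
druid.server.maxSize=130000000000
```

Here 7 threads * 512MB buffers = 3.5G, which fits under the 8G direct memory limit, and druid.server.maxSize caps the segment data assigned to the node.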

Thanks Sascha & Nishant, it’s all clear now :smiley:

Hi
Please explain to me what the purpose of direct_memory is in Druid.

How is it used? How is it related to the JVM?

Hey,

This will give you an idea of how direct memory is used in Druid: http://druid.io/docs/latest/operations/performance-faq.html

Thanks

Abhishek