[druid-user] Clarification on Heap versus Direct Memory

I am trying to understand the difference between heap and direct memory. I have no Java background, so please bear with me.

I am tuning my MiddleManager node and I keep hitting "GC overhead limit exceeded" errors.

The docs (quoted below) give a memory consumption formula for Tasks on the MiddleManager. Does this mean that a single task's memory usage is heap and direct memory combined, or does it depend on usage?

From the Docs:

Total memory usage

To estimate total memory usage of a Task under these guidelines:

  • Heap: 1GB + (2 * total size of lookup maps)
  • Direct Memory: (druid.processing.numThreads + druid.processing.numMergeBuffers + 1) * druid.processing.buffer.sizeBytes

The total memory usage of the MiddleManager + Tasks:

MM heap size + druid.worker.capacity * (single task memory usage)
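
To make the quoted formula concrete, here is a minimal Python sketch of the arithmetic. It assumes that "single task memory usage" means the task's heap estimate plus its direct memory estimate, which is how the passage reads; every number below (no lookups, 2 processing threads, 2 merge buffers, 100 MiB buffers, 128 MiB MiddleManager heap, 4 task slots) is a hypothetical example, not a recommendation, and the function names are made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def task_heap_bytes(lookup_maps_bytes):
    # Heap: 1GB + (2 * total size of lookup maps)
    return 1 * GIB + 2 * lookup_maps_bytes

def task_direct_bytes(num_threads, num_merge_buffers, buffer_size_bytes):
    # Direct Memory: (numThreads + numMergeBuffers + 1) * buffer.sizeBytes
    return (num_threads + num_merge_buffers + 1) * buffer_size_bytes

def total_bytes(mm_heap_bytes, worker_capacity, single_task_bytes):
    # MM heap size + druid.worker.capacity * (single task memory usage)
    return mm_heap_bytes + worker_capacity * single_task_bytes

# Hypothetical settings, chosen only to show the calculation.
heap = task_heap_bytes(lookup_maps_bytes=0)
direct = task_direct_bytes(num_threads=2, num_merge_buffers=2,
                           buffer_size_bytes=100 * MIB)

# Assumption: a single task's usage is heap + direct memory combined.
single_task = heap + direct
total = total_bytes(mm_heap_bytes=128 * MIB, worker_capacity=4,
                    single_task_bytes=single_task)

print(f"task heap:   {heap / MIB:.0f} MiB")
print(f"task direct: {direct / MIB:.0f} MiB")
print(f"single task: {single_task / MIB:.0f} MiB")
print(f"MM + tasks:  {total / MIB:.0f} MiB")
```

With these example values a task needs about 1524 MiB, and the MiddleManager plus four task slots needs about 6224 MiB in total.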

The GC overhead error always refers to the heap. If you don’t have enough direct memory, the task won’t start at all and will fail with a clear error saying that direct memory was not set correctly.
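
For readers without a Java background: heap and direct memory are two separate limits on a JVM process. The heap is capped with the standard -Xmx flag, and direct (off-heap) memory with -XX:MaxDirectMemorySize. The sketch below only turns two byte counts into those flags; the helper name task_jvm_flags and the sizes are hypothetical. In Druid these flags would normally go into the task JVM options configured on the MiddleManager (druid.indexer.runner.javaOpts, as far as I know), so check that against the docs for your version.

```python
MIB = 1024 ** 2

def task_jvm_flags(heap_bytes, direct_bytes):
    """Format heap and direct memory limits as JVM flags (hypothetical helper)."""
    # Round up to whole MiB so the limits are never smaller than requested.
    heap_mib = -(-heap_bytes // MIB)
    direct_mib = -(-direct_bytes // MIB)
    return [
        f"-Xmx{heap_mib}m",                       # heap limit: GC errors relate to this
        f"-XX:MaxDirectMemorySize={direct_mib}m", # direct (off-heap) memory limit
    ]

# Example sizes only: 1 GiB heap, 500 MiB direct memory.
flags = task_jvm_flags(1024 ** 3, 500 * MIB)
print(" ".join(flags))
```

Since the error in the question is a GC one, the heap flag is the one to look at first.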