memory mapping of segment data

Hi,

I have a couple of questions related to memory mapping of segment data:

  1. When a historical node is started, does it proactively memory map segment data, or does that happen only when a query occurs?

If it happens proactively, which segment on the historical node will be mapped if the memory/disk ratio is larger than 1?

If it doesn’t happen proactively, is there a way to force memory mapping of segments after a cold start?

  2. During some tests, I noticed that memory mapping seems to happen per metric rather than for the segment data as a whole. Could you please shed some light on how memory mapping in Druid works in detail?

  3. The tests show no significant difference in query latency between 1:1 and 1:2 memory/disk ratios. What memory/disk ratio do you recommend?

Best regards,

Roman

Hey Roman,

Historicals proactively memory map segments. What they do is pretty simple: they just download and mmap(2) all currently active segments, regardless of whether they’re being queried or not. Then it’s up to the OS to determine which pages should be cached and which shouldn’t.
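
To illustrate what "map everything, let the OS decide" means in practice, here is a minimal Python sketch of general mmap behavior. It is not Druid's code: the file is a temporary stand-in for a segment file, and the names map_segment_file and read_column_slice are made up for this example. Mapping the file reads nothing from disk by itself; pages are only faulted in when a byte range is actually touched. That would also be consistent with the per-metric behavior observed in the question, assuming a query reads only the byte ranges of the columns it needs.

```python
import mmap
import os
import tempfile


def map_segment_file(path):
    """Map a whole file read-only. No file data is read at this point."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file. The mapping remains valid after
        # the file object is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def read_column_slice(mapped, offset, length):
    """Touch only a byte range; the OS pages in just the pages it covers."""
    return mapped[offset:offset + length]


def demo():
    page = mmap.PAGESIZE
    # Hypothetical stand-in for a segment file: 64 pages of dummy bytes.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(256)) * (64 * page // 256))
        path = tmp.name
    try:
        mapped = map_segment_file(path)
        try:
            # Only the page containing this range needs to be resident.
            chunk = read_column_slice(mapped, 10 * page, 16)
            return len(mapped), chunk
        finally:
            mapped.close()
    finally:
        os.remove(path)
```

Calling demo() maps 64 pages but reads from only one of them; whether the other pages ever become resident, and which get evicted under memory pressure, is left to the OS page cache.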

The best memory/disk ratio depends on your workload. In particular it depends on what time filters and column selections are typical for your queries. Most people figure this out through experimentation.
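
As a starting point for that experimentation, a back-of-the-envelope estimate can help. The sketch below is only a rough model, not a Druid formula: it assumes the data your queries touch is approximately the total segment size scaled by the fraction of the time range and the fraction of column bytes that typical queries read. All function names and numbers are hypothetical.

```python
def estimate_hot_bytes(total_segment_bytes, queried_time_fraction,
                       queried_column_fraction):
    """Rough size of the data that typical queries actually touch."""
    return total_segment_bytes * queried_time_fraction * queried_column_fraction


def memory_to_hot_ratio(page_cache_bytes, hot_bytes):
    """Values of 1.0 or more suggest the hot data could fit in the page cache."""
    return page_cache_bytes / hot_bytes


GIB = 2 ** 30

# Hypothetical node: 1 TiB of segments, queries hit the most recent 25% of
# the time range and columns making up about 50% of the bytes.
hot = estimate_hot_bytes(1024 * GIB, 0.25, 0.5)

# With 256 GiB left for the page cache, memory/disk is 1:4, but memory
# relative to the estimated hot data is 2:1.
ratio = memory_to_hot_ratio(256 * GIB, hot)
```

If an estimate like this says the hot data fits in memory at both ratios you tested, seeing no latency difference between 1:1 and 1:2 would not be surprising.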