How to keep the segments in memory all the time in Druid Historical nodes

According to the documentation, the Druid Historical node memory-maps its segments and loads them into memory at query time. However, we have just created a “hot” historical tier that holds 24 hours of data; we call it daily_tier. How do we configure the historical nodes to always keep that data in memory, so Druid doesn’t have to load it from disk? We have only about 100 GB of data per day.
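For reference, a historical node joins a tier through its runtime.properties, and the coordinator then needs a period load rule (e.g. a loadByPeriod rule with period P1D) targeting that tier. A minimal sketch of the historical side, where the path and size values are illustrative placeholders:

```
# runtime.properties on the daily_tier historicals (illustrative values)
druid.server.tier=daily_tier
druid.server.maxSize=130000000000

# Local disk location of the memory-mapped segment files
druid.segmentCache.locations=[{"path":"/mnt/druid/segment-cache","maxSize":130000000000}]
```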

These are our JVM opts:

```
java -server -Xmx12g -Xms12g -XX:NewSize=6g -XX:MaxNewSize=6g \
  -XX:MaxDirectMemorySize=18g -XX:+UseConcMarkSweepGC \
  -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
  -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp \
  -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager \
  -Dcom.sun.management.jmxremote.port=17071 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false \
  -classpath .:/usr/local/lib/druid/lib/* \
  io.druid.cli.Main server historical
```

Or should we just increase the memory in the JVM?

We disable caching on the historicals and enable it on the broker instead.
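A minimal sketch of that split in runtime.properties, assuming a memcached-backed broker cache (the memcached hosts are placeholders):

```
# On the historicals: don't use or populate the query cache
druid.historical.cache.useCache=false
druid.historical.cache.populateCache=false

# On the broker: use and populate a memcached-backed cache
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
druid.cache.type=memcached
druid.cache.hosts=memcached1:11211,memcached2:11211
```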

The reason I ask is that even though we have a daily_tier cluster, every time we query the broker for a day’s worth of data the historicals seem to be re-loading it from disk. We noticed this from the query times: the first query takes at least 40-60 seconds to finish, while subsequent queries are fast, presumably because the result has already been cached in the broker’s memcached.

Druid memory-maps its segment files, so “keeping segments in memory” really means keeping them in the OS page cache. You want to keep the JVM heap size reasonable and leave as much free RAM as possible for the machine’s page cache.
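As a rough sizing sketch (the 64 GB per-node RAM figure below is an assumption, not something stated above):

```
# Per-node JVM footprint, from the opts above:
#   heap          = 12 GB  (-Xmx12g)
#   direct memory = 18 GB  (-XX:MaxDirectMemorySize=18g)
#   JVM total     ~ 30 GB
#
# Assuming 64 GB of RAM per node (substitute your actual box size):
#   page cache left ~ 64 - 30 - ~2 GB (OS) ~ 32 GB
#
# To keep the full 100 GB hot day resident in page cache:
#   nodes needed ~ 100 / 32 ~ 3-4 historicals in daily_tier
```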

If you have decent monitoring on the box you can watch page faults, etc. and see if it’s doing what you expect.
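For example, standard Linux tools will show whether the segments stay resident; vmtouch is a third-party utility, and the segment-cache path here is the illustrative one from the sketch above:

```
# RAM used by the page cache on the box
free -h

# Major page faults per second (majflt/s); sustained spikes while
# queries run mean segments are being read back from disk
sar -B 1

# vmtouch (https://hoytech.com/vmtouch/) reports how much of a
# directory is resident in the page cache and can pre-warm it
vmtouch -v /mnt/druid/segment-cache   # check residency
vmtouch -t /mnt/druid/segment-cache   # touch (pre-warm) into cache
```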