High disk usage on realtime nodes

Hello all,

My realtime nodes are using a lot of disk. I currently have intermediatePersistPeriod set to 1 minute. Do realtime nodes remove these rows from disk when handing off segments? If not, are there configurations I can set to help with this?

Thanks guys,


Where do realtime nodes persist to during that intermediate period? What is the default location?

Disregard my last post. I found the location in the docs: druid.segmentCache.locations.

Which leads to another question: why should the maxSize of this directory always be zero?

OK, so I might be confused now. What is the difference between basePersistDirectory in the schema spec and druid.segmentCache.locations in the realtime runtime properties?

Hey Nicholas,

The basePersistDirectory is the directory that realtime nodes will store data in before it is handed off. The realtime nodes do remove data from their basePersistDirectory when it’s handed off.
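For reference, a rough sketch of where these two settings might appear in a realtime spec. The path here is a made-up placeholder, and the exact nesting is an assumption: it depends on your Druid version, and older spec formats may place these fields elsewhere, so check the docs for the version you run.

```json
{
  "tuningConfig": {
    "type": "realtime",
    "intermediatePersistPeriod": "PT1M",
    "basePersistDirectory": "/tmp/druid/realtime/basePersist"
  }
}
```

With a 1 minute persist period, expect many small intermediate persists to accumulate under that directory until the segment is handed off.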

druid.segmentCache.locations lists the directories that historical nodes download segments to and memory-map them from.
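As an illustration only, the value of druid.segmentCache.locations in a historical node's runtime.properties is a JSON array of locations, each with a path and a maxSize in bytes. The path and size below are placeholders, not recommendations:

```json
[{"path": "/mnt/druid/segment-cache", "maxSize": 10000000000}]
```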

Thanks Gian! You're always so helpful.