Druid: Do Historical nodes really store data on local disk?

Hi,

I am very new to Druid. I just set up a 3-node Druid cluster:

Node 1 - Coordinator and Overlord

Node 2 - Historical and MiddleManager

Node 3 - Broker

I am using HDFS as deep storage, configured as below.

druid.storage.type=hdfs

druid.storage.storageDirectory=/druid/segments

Questions:

Do Historical nodes really store data on local disk?

Or do they load all the data from deep storage into memory, based on the metadata information, when the Druid process starts?

Also, if I am using local storage, does it have to be shared (e.g. an NFS share)?

Where can I set the replication parameter for Historical nodes?

Thanks

Siddesh

Revana

Yes. A Historical keeps a copy of the segments it serves on its local disk, in addition to what is held in memory.

The storage directory should be a full HDFS URI. For example:

druid.storage.storageDirectory=hdfs://NNHA/apps/druid/segments

The same applies to the indexing log directory.
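As a rough sketch, the common properties could look like the fragment below. The NNHA nameservice comes from the example above; the indexing-logs path is hypothetical, and the extension load list assumes you are not already loading other extensions (if you are, append to your existing list).

```properties
# common.runtime.properties (shared by all Druid services)

# Assumption: the HDFS extension is on the load list; keep any extensions you already load.
druid.extensions.loadList=["druid-hdfs-storage"]

# Deep storage: segments are published here and pulled from here by Historicals.
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://NNHA/apps/druid/segments

# Indexing task logs. The path below is hypothetical; use your own location.
druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=hdfs://NNHA/apps/druid/indexing-logs
```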

Yes, deep storage has to be shared: if you have multiple Historical / MiddleManager nodes (called data nodes), the deep storage needs to be accessible from all of them. With local deep storage that means a shared mount such as NFS. Each Historical's own local storage (its segment cache) needn’t be shared.

The replication factor is set as part of the load rules: edit the datasource's retention rules in the Coordinator console. Refer to the attached screenshot of the Coordinator console from Druid 0.14 / Imply 2.9.
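For reference, a load rule with a replication factor looks roughly like the JSON below when viewed as raw rules. The replica count of 2 is only an illustration, and `_default_tier` assumes you have not defined custom tiers. Note that replicas of a segment are placed on different Historicals, so a count of 2 only takes effect once the tier has at least two Historical nodes.

```json
[
  {
    "type": "loadForever",
    "tieredReplicants": {
      "_default_tier": 2
    }
  }
]
```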

Thanks & Rgds

Venkat

Hi Venkat,

Thanks for the quick response on this.

druid.storage.type=hdfs

druid.storage.storageDirectory=/druid/segments - This is for deep storage: HDFS, shared across all Historical nodes.

For individual nodes,

druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":130000000000}] - This is the place where all the Druid segments are stored for that particular node, right?

Thanks

Siddesh

For individual nodes, you are right.

But for deep storage, druid.storage.storageDirectory should be a full HDFS URI and not just /druid/segments.
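On the per-node side, a minimal sketch of the Historical's own settings might look like this. The segment cache line is the one quoted earlier in the thread; the druid.server.maxSize value is an assumption chosen to match it, so adjust both to the disk you actually have.

```properties
# historical/runtime.properties (per node; this directory is not shared)

# Local directory where this Historical keeps the segments it downloads from deep storage.
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":130000000000}]

# Assumption: total bytes this node advertises for segments.
# Keep it no larger than the sum of maxSize across the locations above.
druid.server.maxSize=130000000000
```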