Hi,
I am very new to Druid. I just set up a 3-node Druid cluster:
Node 1 - Coordinator and Overlord
Node 2 - Historical and MiddleManager
Node 3 - Broker
I am using HDFS as deep storage, configured as below:
druid.storage.type=hdfs
druid.storage.storageDirectory=/druid/segments
Questions:
Do Historical nodes really store data on local disk? Or do they load all the data from deep storage, based on the metadata, into memory when the Druid process starts?
Also, if I am using local storage for deep storage, does it have to be shared (an NFS share)?
Where can I set the replication parameter for Historical nodes?
Thanks
Siddesh Revana
Historicals store a copy of the required data on local disk, in addition to what they keep in memory.
The deep storage directory should point to a full HDFS URI. For example:
druid.storage.storageDirectory=hdfs://NNHA/apps/druid/segments
The same applies to the indexing logs directory.
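For example, assuming the druid-hdfs-storage extension is loaded and an HDFS nameservice called NNHA (the nameservice and paths here are just illustrative), the common runtime.properties would look roughly like:
druid.extensions.loadList=["druid-hdfs-storage"]
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://NNHA/apps/druid/segments
druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=hdfs://NNHA/apps/druid/indexing-logs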
Yes – if you have multiple Historical / MiddleManager nodes (the data nodes), then the deep storage needs to be accessible from all of them. Each Historical also has its own local storage (the segment cache), and that does not need to be shared.
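Each Historical configures that local segment cache in its own runtime.properties, something along these lines (the path and size are just examples):
druid.segmentCache.locations=[{"path":"/var/druid/segment-cache","maxSize":130000000000}]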
The replication parameter is set as part of the load rules, on the Coordinator console page, by editing the retention rules. Refer to the attached screenshot of the Coordinator console from Druid 0.14 / Imply 2.9.
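For reference, the rule the console edits is just JSON under the hood; a load-forever rule keeping two replicas of each segment in the default tier would look roughly like this:
{"type": "loadForever", "tieredReplicants": {"_default_tier": 2}}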
Thanks & Rgds
Venkat
Hi Venkat,
Thanks for the quick response on this.
druid.storage.type=hdfs
druid.storage.storageDirectory=/druid/segments - this is for deep storage, on HDFS, shared across all Historical nodes…
For the individual nodes:
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":130000000000}] - this is the place where all the segments for a particular node are stored, right?
Thanks
Siddesh
You are right about the individual nodes.
But for deep storage, druid.storage.storageDirectory should be a full HDFS URI, not just /druid/segments.
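So, sticking with the illustrative NNHA nameservice from above, something like:
druid.storage.storageDirectory=hdfs://NNHA/druid/segments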