HDFS/S3 vs Local storage

Hello everyone,

I’m new to druid.io and Hadoop/HDFS. I’ve already successfully configured Druid on a single machine.
Now I’m trying to configure a cluster that can handle about 5 billion records.

I’ve read the docs, especially http://druid.io/docs/latest/configuration/production-cluster.html and http://druid.io/docs/latest/tutorials/cluster.html
But I’m not sure about deep storage on HDFS/S3. I have a 1 Gbps network on my servers, which limits me to about 100 MB/s, versus 1 GB/s on RAID 10 with 4x SSD.
In that case, isn’t it better to keep the Historical nodes on local disk? I understand it’s not safe with regard to data loss, but I keep the raw data in other databases on a different server anyway. Also, I don’t need realtime data: I only index data once a month from another database.

For now I have 3 servers available for Druid: 2x 256 GB RAM, 20c/40t, and 1x 128 GB RAM, 6c/12t, all with SSDs. So I thought about using them in the following way:

  1. 256GB RAM for Historical node + Overlord node
  2. 256GB RAM for Middle Manager + Coordinator
  3. 128GB RAM for Broker Node + Zookeeper

Does that seem reasonable?

Any help will be highly appreciated. Thanks in advance!

Hi Michael,

Welcome to the Druid community!

One thing I want to clarify is that Druid uses deep storage only as a backup of your data and as a way to transfer data in the background between Druid nodes. To respond to queries, Historical nodes must prefetch all active segments to their local disks before any queries are served. This means that Druid never needs to access deep storage during a query, helping it offer the best query latencies possible. It also means that you must have enough disk space both in deep storage and across your Historical nodes for the data you plan to load. And it also means that network speed to your deep storage isn’t really critical for performance. And finally - it means that deep storage must be something shared between Druid nodes, like S3/HDFS. If it’s a locally mounted filesystem, it must be a network mount so all machines can see the same filesystem.
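To put rough numbers on the points about disk space and network speed, here is a minimal Python sketch. The 450 GB segment size, the replication factor of 2, and the per-node disk sizes are hypothetical figures chosen only for illustration (the thread does not say how large 5 billion records are once indexed), and the throughput is the theoretical line rate with no protocol overhead.

```python
from typing import Sequence


def link_throughput_mb_per_s(gbps: float) -> float:
    """Theoretical link throughput in MB/s (decimal units, no protocol overhead)."""
    return gbps * 1000 / 8


def initial_load_hours(segment_gb: float, gbps: float) -> float:
    """Hours needed to copy segment_gb of segments from deep storage over one link."""
    seconds = segment_gb * 1000 / link_throughput_mb_per_s(gbps)
    return seconds / 3600


def fits_on_historicals(segment_gb: float, replicas: int,
                        disk_gb_per_node: Sequence[float]) -> bool:
    """True if every replica of every segment fits in the combined Historical disk space."""
    return segment_gb * replicas <= sum(disk_gb_per_node)


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the thread.
    segment_gb = 450.0
    print(link_throughput_mb_per_s(1.0))              # MB/s on a 1 Gbps link
    print(initial_load_hours(segment_gb, 1.0))        # one-time load, in hours
    print(fits_on_historicals(segment_gb, 2, [800.0, 800.0]))
```

The load from deep storage happens in the background when segments are assigned, not per query, so under these assumptions the 1 Gbps link costs a one-off wait after each monthly indexing run rather than slowing queries down.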

If you’re not doing any realtime data then I would probably do something more like the following.

  1. 256 GB for Historical + MiddleManager
  2. 256 GB for Historical + MiddleManager
  3. 128 GB for Broker + ZooKeeper + Overlord + Coordinator

Note that this will not be a highly available config, since you only have one broker, zk, overlord, and coordinator. For high availability you’ll want 3 or 5 ZKs, and at least 2 of each of the other processes.
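The "3 or 5" recommendation follows from ZooKeeper’s majority quorum: the ensemble keeps working only while more than half of its nodes are up. A small Python sketch of that arithmetic (the function names are made up for illustration):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, quorum_size(n), tolerated_failures(n))
```

An even-sized ensemble tolerates no more failures than the odd size just below it, which is why odd sizes are the usual choice.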

Hi Gian,

Thank you for the nice explanation, that explains everything.


On Tuesday, 16 January 2018 at 17:44:38 UTC+1, Gian Merlino wrote: