Segment Cache: Storage Recommendations

Hi there!

What are the storage recommendations for segment cache?
Only local SSD? What about SSD-backed network-attached storage?

By the way, what are the expected performance metrics (e.g. IOPS per GB)?



@jon.king - a topic you may have something to say about.

Well, any query that touches data already in deep storage is served from the segment cache (that is, the disk attached to the historical nodes), so you want the segment cache to be as fast as possible. SSDs are of course preferred, and we have seen some types of NAS, depending on their throughput, slow a cluster to a halt.

If you haven’t seen it already, I would recommend looking at tiering options. That would allow you to assign more expensive, faster disk to “hotter” data sets, and slower, cheaper disk to less important or “colder” data.
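
To make the tiering idea a bit more concrete, here is a rough sketch of the kind of load rules you might end up with. Treat it as an illustration only: the tier name "hot", the one-month period, and the replica counts are placeholders I made up, and the helper function is hypothetical. The "hot" tier name would have to match whatever `druid.server.tier` is set to on the historicals with the faster disks.

```python
import json


def tiering_rules(hot_period="P1M", hot_tier="hot", hot_replicas=2,
                  cold_tier="_default_tier", cold_replicas=1):
    """Build an ordered list of load rules (hypothetical helper).

    Rules are evaluated top to bottom, so segments inside the recent
    period match the first rule and everything older falls through
    to the second one.
    """
    return [
        # Recent ("hotter") data: load onto the tier backed by fast disk.
        {
            "type": "loadByPeriod",
            "period": hot_period,
            "tieredReplicants": {hot_tier: hot_replicas},
        },
        # Everything else ("colder") data: load onto the cheaper tier.
        {
            "type": "loadForever",
            "tieredReplicants": {cold_tier: cold_replicas},
        },
    ]


if __name__ == "__main__":
    # Print the rules as JSON, ready to review before applying them.
    print(json.dumps(tiering_rules(), indent=2))
```

The split is by data age here because that is the most common case, but the right period and replica counts depend entirely on your query patterns.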

Thanks, Rachel.

I’m considering tiering for production; however, right now I’m detailing the essential infrastructure requirements. Yes, it depends on a lot of things (data volume, query latency, etc.).

Anyway, let’s suppose we provision the infrastructure on AWS: which EBS volume type (SSD) would you suggest?