We have several weeks' worth of data residing on 15 historical nodes, and when they were running full, I removed one week of data using a drop rule.
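In case it matters, the rule change I applied was along these lines (a simplified sketch; the coordinator host, datasource name, and interval are placeholders, not our real values):

```python
import requests

COORDINATOR = "http://coordinator-host:8081"   # placeholder
DATASOURCE = "my_datasource"                   # placeholder

# Rules are evaluated top to bottom, so the dropByInterval for the week in
# question comes before the catch-all load rule that keeps replication at 2.
rules = [
    {"type": "dropByInterval", "interval": "2024-01-01/2024-01-08"},  # week to drop (example)
    {"type": "loadForever", "tieredReplicants": {"_default_tier": 2}},
]

resp = requests.post(f"{COORDINATOR}/druid/coordinator/v1/rules/{DATASOURCE}", json=rules)
resp.raise_for_status()
```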
Before dropping the data, the coordinator console showed all historicals as roughly equally “full”, at around 98%.
Immediately after the deletion, I’m seeing the following in the coordinator console:
I was surprised to see that most of that data seemed to have resided on a single historical, and that some of it also seemed to have been on all but one of the others. Is this how things should be, or should I be concerned? We are having performance issues, and I was wondering whether they might be due to this uneven distribution.
We have also set the replication level to 2, so I was thinking that maybe one replica of each segment ended up on the same historical while the other replica got distributed among the others.
However, prior to the deletion the cluster had had days to balance itself, and I did see it rebalance the segments after the deletion.
Is there a quick and easy way to see whether the data is not only spread evenly across the historicals in terms of volume, but also distributed so that a query can be processed well in parallel?
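The best I've come up with so far is something like the following against the sys tables via the broker's SQL endpoint (broker host is a placeholder), but that only shows me per-node totals, not whether the segments for a given interval are concentrated on one node:

```python
import requests

BROKER = "http://broker-host:8082"  # placeholder

# Segment count and total bytes per historical; adding datasource or the
# segment start time to the GROUP BY would show whether a particular
# interval sits mostly on one node, which is what matters for parallelism.
SQL = """
SELECT
  srv.server,
  COUNT(*)        AS num_segments,
  SUM(seg."size") AS total_bytes
FROM sys.server_segments AS ss
JOIN sys.servers  AS srv ON ss.server = srv.server
JOIN sys.segments AS seg ON ss.segment_id = seg.segment_id
WHERE srv.server_type = 'historical'
GROUP BY srv.server
ORDER BY total_bytes DESC
"""

resp = requests.post(f"{BROKER}/druid/v2/sql", json={"query": SQL})
resp.raise_for_status()
for row in resp.json():
    print(row)
```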