We use Imply 0.9.1.1.
Our cluster runs on AWS on 30 i3.8xlarge instances.
After a deep investigation into query performance, we’ve discovered some very obscure behavior.
**The problem occurs when we run a query and sometimes, without any logical reason, the query time is huge. We almost time out before getting any response from the Broker.** We think this can occur because the Coordinator never stops moving segments from one machine to another. In fact, segments keep moving even when we do not add or remove any data from the cluster. I understand the idea of rebalancing to optimize data storage, but it has to stop if there is no update to the data (no add, no remove, no task at all)!
While reading the code, we came across this class
In version 0.9.1.1, it seems that all segment rebalancing goes through this class.
We can’t actually explain the query performance problem, which occurs randomly. We can only confirm that all the metrics show query time on the Historical nodes going up, and we’re unable to find out why.
We are left only with the presumption that it could be related to the fact that the data keeps being rebalanced even when nothing is happening on the cluster.
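
For anyone who wants to check the same thing on their own cluster, here is a minimal sketch of how the pending segment moves could be watched while the cluster is idle. It assumes the Coordinator is reachable at `http://localhost:8081` and that the `/druid/coordinator/v1/loadqueue?simple` endpoint returns, per Historical, a map containing `segmentsToLoad` and `segmentsToDrop` counts. The helper names `fetch_load_queue` and `pending_moves` are ours, not part of Druid.

```python
import json
from urllib.request import urlopen

# Assumption: Coordinator address; adjust for your cluster.
COORDINATOR_URL = "http://localhost:8081"


def fetch_load_queue(base_url=COORDINATOR_URL):
    """Fetch the simplified load queue report from the Coordinator."""
    with urlopen(base_url + "/druid/coordinator/v1/loadqueue?simple") as resp:
        return json.loads(resp.read().decode("utf-8"))


def pending_moves(load_queue):
    """Return {server: (to_load, to_drop)} for servers that still have queued work.

    Assumption: each value in load_queue is a dict with the keys
    'segmentsToLoad' and 'segmentsToDrop'; missing keys count as 0.
    """
    busy = {}
    for server, stats in load_queue.items():
        to_load = int(stats.get("segmentsToLoad", 0))
        to_drop = int(stats.get("segmentsToDrop", 0))
        if to_load or to_drop:
            busy[server] = (to_load, to_drop)
    return busy
```

Calling `pending_moves(fetch_load_queue())` every minute or so while no ingestion is running, and comparing the timestamps with the slow queries, should show whether the two are actually correlated or whether the rebalancing is a red herring.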