Hello,
When loading data from S3 after Hadoop indexing, we often see an uneven distribution of data across our historical nodes. Recently one node was more than 90% full while the others were ~60% full. Is this a problem? Does it suggest a configuration problem?
Thanks for your help,
Morri Feldman

Forgot to mention that we are running imply-2.0.0 / Druid 0.9.2
It could just mean that your cluster is balancing slowly. There’s a throttle called “maxSegmentsToMove” that you can edit if you click the pencil in the coordinator console. The default value is pretty low.
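If you would rather script the change than use the console, here is a minimal sketch, assuming the coordinator's dynamic configuration endpoint at /druid/coordinator/v1/config and a hypothetical coordinator address. The helper names are made up for illustration. It copies the current config before changing the one value, since a POST may replace the whole config and fields you leave out could revert to their defaults.

```python
import json
import urllib.request

CONFIG_PATH = "/druid/coordinator/v1/config"


def build_dynamic_config(max_segments_to_move, current=None):
    """Return a copy of the current dynamic config with maxSegmentsToMove set.

    `current` should be the JSON object returned by a GET on CONFIG_PATH;
    copying it keeps the other settings unchanged.
    """
    config = dict(current or {})
    config["maxSegmentsToMove"] = int(max_segments_to_move)
    return config


def post_dynamic_config(coordinator_url, config):
    """POST the config to the coordinator and return the HTTP status code.

    `coordinator_url` is a placeholder such as "http://coordinator:8081".
    """
    request = urllib.request.Request(
        coordinator_url.rstrip("/") + CONFIG_PATH,
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

For example, build_dynamic_config(50, current_config) gives you the payload to pass to post_dynamic_config. Check the result in the console afterwards to confirm the new value was accepted.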
Thanks @Gian
Unfortunately, we are already running with maxSegmentsToMove set to 300.
Hi,
I have the same problem as you. Have you solved it?
If so, could you tell me how?
Thank you.
On Friday, May 19, 2017 at 1:37:22 AM UTC+8, mo…@appsflyer.com wrote:
Hi Linjing,
We never really solved it. I just checked and two of our clusters have an even distribution of data on the historical nodes, but on one cluster the nodes vary by ~20% in how much data they are holding.
Best,
Morri
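
For anyone who wants to put a number on the imbalance, here is a rough sketch of how the spread between historical nodes could be computed. It assumes you already have a list of server records with "host", "type", "currSize" and "maxSize" fields, which is what I believe the coordinator's /druid/coordinator/v1/servers?simple endpoint returns; treat those field names and the function names as assumptions.

```python
def usage_percentages(servers):
    """Map each historical node's host to the percentage of its capacity in use."""
    result = {}
    for server in servers:
        # Skip non-historical nodes and records without a usable capacity.
        if server.get("type") != "historical" or not server.get("maxSize"):
            continue
        result[server["host"]] = 100.0 * server["currSize"] / server["maxSize"]
    return result


def usage_spread(servers):
    """Return the gap, in percentage points, between the fullest and emptiest node."""
    percentages = usage_percentages(servers)
    if not percentages:
        return 0.0
    return max(percentages.values()) - min(percentages.values())
```

With one node at 90% and another at 60%, usage_spread reports a gap of 30 percentage points.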