Middle Manager workers not being assigned to full capacity.

Hello,

I have 2 Middle Manager nodes, one with a capacity of 3 workers and the other with a capacity of 15. I submitted 9 indexing tasks, but only 6 are being processed, 3 on each Middle Manager node. I verified that the Middle Manager has workers that are not being used.

Is a heterogeneous worker configuration for Middle Managers not supported? If it is supported, what could I have misconfigured?

I think I found the cause: I was using a weak machine (m4.large). I just tested this Scala code on a machine with a single virtual core (t2.micro):

tasks.map { task =>
Future(index(task))
}

and the outcome is basically:

tasks.map { task =>
index(task)
}

which is surprising to me. I thought some context switching would be applied, so that the blocking code would be run by multiple threads, concurrently even if not truly in parallel.
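A minimal sketch of why this happens, under some assumptions: Scala's default global ExecutionContext sizes its thread pool from the number of available processors, so on a single core there is only one thread to run the blocking calls. To reproduce that without a t2.micro, the sketch below uses a fixed-size pool instead of the global one. The index method, the Tracker class and peakConcurrency are hypothetical stand-ins, not the code from this thread.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object PoolSizeDemo {
  // Records how many index() calls are running at the same moment.
  final class Tracker {
    private var running = 0
    private var peak = 0
    def enter(): Unit = synchronized { running += 1; if (running > peak) peak = running }
    def exit(): Unit = synchronized { running -= 1 }
    def max: Int = synchronized(peak)
  }

  // Hypothetical stand-in for a blocking indexing call.
  def index(task: Int, tracker: Tracker): Int = {
    tracker.enter()
    try { Thread.sleep(100); task } finally tracker.exit()
  }

  // Runs all tasks as Futures on a pool with `threads` threads and
  // returns the highest number of tasks observed running at once.
  def peakConcurrency(threads: Int, tasks: Seq[Int]): Int = {
    val pool = Executors.newFixedThreadPool(threads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    val tracker = new Tracker
    try {
      val all = Future.sequence(tasks.map(t => Future(index(t, tracker))))
      Await.result(all, 30.seconds)
      tracker.max
    } finally pool.shutdown()
  }
}
```

With a pool of one thread, a blocking call holds that thread until it returns, so the Futures run one after another. The operating system can switch between threads, but it cannot switch between tasks queued on the same thread.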

So the number of indexing tasks running at once was always limited by the number of virtual cores of the machines I was using, which was 1, 2, 4 and 8 :)

When I started the indexing code on an m4.2xlarge machine, it was issuing 8 requests in parallel…

I have been using Futures for 7 years now, and I was sure that even on a machine with a single virtual core they would run concurrently, just not in parallel.

Discovery of the day !!!

So the solution is to use an ExecutionContext with an unlimited number of threads in its thread pool, in the client application.
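One way this could look, as a sketch rather than the actual client code: wrap a cached thread pool from java.util.concurrent (which creates threads on demand, with no fixed upper bound) in an ExecutionContext and use it for the blocking calls. The index method and the task names here are hypothetical placeholders.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object UnboundedPoolExample {
  // Hypothetical placeholder for the blocking indexing request.
  def index(task: String): String = {
    Thread.sleep(100)
    s"indexed:$task"
  }

  // Runs every task on its own thread, regardless of the core count.
  def indexAll(tasks: Seq[String]): Seq[String] = {
    // A cached thread pool grows as needed, so the number of
    // in-flight blocking calls is not tied to the number of cores.
    val pool = Executors.newCachedThreadPool()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val futures = tasks.map(task => Future(index(task)))
      Await.result(Future.sequence(futures), 1.minute)
    } finally pool.shutdown()
  }
}
```

An unbounded pool starts one thread per pending blocking call, which may be too many for a large batch. A fixed pool sized to the desired number of concurrent requests would be a more conservative choice.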

My fault. Thanks !!!

Omg, I replied in the wrong thread. My answer was meant for a different, similar thread, see:

https://groups.google.com/forum/#!topic/druid-user/lxY7fWGE03M

No worries Jakub. It happens.

Carlos, the indexing service uses locks to prevent tasks from overwriting each other and writing the same pieces of data. Can you verify that your data does not overlap in any way?

Hello,

I verified that my data does not overlap anywhere. I am going to try to reproduce the configuration again.