Peon performance tuning to reduce the number of threads

Hello everyone,

I am trying to tune the peon memory config. I have a 40-core machine where I encounter “OOM: could not create a native thread” when there are more than 25 running peons.

I think this is related to the high number of threads the peons spawn. In my application it is about 250 threads per peon process, and when the total gets close to 6k the OOMs start to occur. Most of these threads have a very short life span:

$ ps -U druid -T | head

PID SPID TTY TIME CMD

23460 23460 ? 00:00:00 gpg-agent

23486 23486 ? 00:00:00 ssh-agent

28089 28089 ? 00:00:00 java

28089 28183 ? 00:00:04 java

28089 28184 ? 00:00:00 java

28089 28185 ? 00:00:00 java

28089 28186 ? 00:00:00 java

28089 28187 ? 00:00:00 java

28089 28188 ? 00:00:00 java
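(For reference, the per-process thread count can also be read directly with ps, using the peon PID from the listing above:

$ ps -o nlwp= -p 28089

)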

I checked, and my system limits are much higher than 6k:

$ ulimit -u

1029071

I wonder if that number of threads is normal and how I should tune my system. Should I aim to reduce the number of threads or increase the limit my application can handle, and what settings should I look into?

I also have a side question. I’ve seen that it is recommended to run (number of cores - 1) workers, which makes sense. But I can see that in my application the cores are idle most of the time. Can I increase the number of workers to higher values if I want to increase the indexing power of my machine and use my resources more efficiently?

Thank you

Adam

You might have a very high druid.processing.numThreads (250?), more than the peon process can support. The number of workers is also (number of cores - 1), similar to the threads. You can increase the workers if you see there are threads not being used. Also check your JVM settings for the MiddleManager. The rule of thumb is 256 MB * druid.processing.numThreads, which comes to about 10 GB if using 39 threads. If you are using 250 threads, that’s 64 GB of heap.
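If the peon heap needs adjusting to match that rule of thumb, it is set from the MiddleManager side via druid.indexer.runner.javaOpts in the middleManager runtime.properties; a sketch for the 39-thread case (the exact -Xmx value here is illustrative):

druid.indexer.runner.javaOpts=-server -Xmx10g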

Rommel Garcia

That was exactly the case! Changing this, and also druid.indexer.fork.property.druid.server.http.numThreads, to lower values was the first thing I tried. But I still see ~250 threads per process with the ps -U druid -T command. I did this on the node where the middleManager runs. Maybe it needs to be changed somewhere else, on the druid-coordinator node? Or is it stored in ZooKeeper?

I still don’t know what causes me to have 250 threads per peon, but I just found out why I wasn’t able to run more than 6k threads. I was running my service via systemd and I hit its pids limit:

$ cat /sys/fs/cgroup/pids/system.slice/druid-indexer.service/pids.max

6143
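That limit maps to systemd’s TasksMax for the unit, so it can be raised with a drop-in along these lines (the value here is illustrative):

$ sudo systemctl edit druid-indexer.service

[Service]
TasksMax=16384

$ sudo systemctl restart druid-indexer.service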

Hi Adam,

Can you check the values in your config for the properties below?

druid.server.http.numThreads

druid.processing.numThreads

druid.indexer.fork.property.druid.processing.numThreads

druid.indexer.fork.property.druid.server.http.numThreads

Thanks,

Sashi

Hello Sashi, sure! I am now using the following settings:

druid.server.http.numThreads = 50

druid.processing.numThreads = 6

druid.indexer.fork.property.druid.processing.numThreads = 4

druid.indexer.fork.property.druid.server.http.numThreads = 50

Hello Adam,

Does each peon still have 250 threads?

Try removing all the fork properties, as these explicitly define resources for the child peons. All the properties from the MiddleManager will be inherited by the peons.
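For illustration, the middleManager runtime.properties would then look roughly like this (using the values you posted), with the fork overrides dropped so the peons simply inherit:

druid.server.http.numThreads=50

druid.processing.numThreads=6

# removed: druid.indexer.fork.property.druid.processing.numThreads

# removed: druid.indexer.fork.property.druid.server.http.numThreads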

Rommel Garcia
Director, Field Engineering
rommel.garcia@imply.io
404.502.9672