Middle Manager CPU at 100% and running tasks get stuck

Hi everyone,

Lately some of our Hadoop indexing tasks have been getting stuck. We believe this happens because the Middle Manager’s CPU peaks and stays at 100%.

In these cases the CPU is consumed by the Java processes that the Middle Manager spawns for each task (these are what are called peons, right?). Not really knowing what else to do, we’ve been killing these processes manually; the task is then shut down, the CPU goes back down, and everything is back to normal…until it happens again.
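For reference, this is roughly what we do when it happens. In our 0.9.1.1 setup the peon processes show up in the process list with io.druid.cli.Main internal peon in the command line; the PID below is just a placeholder:

# find the peon processes spawned by the Middle Manager
$ ps aux | grep "io.druid.cli.Main internal peon" | grep -v grep

# kill the one pegging a core (PID taken from the output above);
# if it doesn't exit we resort to kill -9
$ kill 12345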

We haven’t been able to figure out what is causing this. The Middle Manager logs don’t show any errors, and we haven’t been able to reproduce the issue ourselves.

What can we do to solve this? Do we need to perhaps add more Middle Managers to our cluster?

Thanks,

Joan

Hi,

If nobody’s run into this issue, at least I’d like to know how adding more Middle Managers could affect the whole system.

Thanks,

Joan

Hi Joan,
For batch index tasks, the peon just submits an MR job and waits for it to complete. The peon itself does not ingest data and should not need much CPU.

Is there any info in the logs around the time the tasks get stuck? Can you take a few thread dumps and check where it’s stuck?
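For the thread dumps, something along these lines should do (jstack ships with the JDK; <peon_pid> is a placeholder for the stuck peon’s process id):

# take a few thread dumps of the stuck peon, a few seconds apart
$ for i in 1 2 3; do jstack -l <peon_pid> > peon-threads-$i.txt; sleep 5; done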

Hi Nishant,

I killed the peons and restarted the Middle Manager service, and for a couple of days it hasn’t been consuming as much CPU. If it happens again I’ll check the thread dumps and the logs. If the peons only submit the MR jobs and wait for them, then I’m guessing that adding more Middle Managers wouldn’t be very helpful.

However, today I stumbled upon something that may be related:

I launched a second Druid cluster, identical to the one we have in production, and checked the Java processes running in both:

Production:

root 1682 0.1 4.3 8930772 1348328 ? Sl Mar06 17:26 java -server -Xms3g -Xmx3g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=var/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -cp /opt/druid-0.9.1.1/conf/druid/_common:/opt/druid-0.9.1.1/conf/druid/overlord:/opt/druid-0.9.1.1/lib/* io.druid.cli.Main server overlord

root 2653 0.1 4.3 8935212 1346468 ? Sl Jan30 78:33 java -server -Xms3g -Xmx3g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=var/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -cp /opt/druid-0.9.1.1/conf/druid/_common:/opt/druid-0.9.1.1/conf/druid/overlord:/opt/druid-0.9.1.1/lib/* io.druid.cli.Main server overlord

root 20130 0.1 4.6 8943152 1451188 ? Sl Feb23 57:03 java -server -Xms3g -Xmx3g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=var/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Dderby.stream.error.file=var/druid/derby.log -cp /opt/druid-0.9.1.1/conf/druid/_common:/opt/druid-0.9.1.1/conf/druid/coordinator:/opt/druid-0.9.1.1/lib/* io.druid.cli.Main server coordinator

Second cluster:

root 1696 0.1 3.3 8935456 1049712 ? Sl 08:54 0:56 java -server -Xms3g -Xmx3g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=var/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Dderby.stream.error.file=var/druid/derby.log -cp /opt/druid-0.9.1.1/conf/druid/_common:/opt/druid-0.9.1.1/conf/druid/coordinator:/opt/druid-0.9.1.1/lib/* io.druid.cli.Main server coordinator

root 1705 0.2 2.9 8928836 923308 ? Sl 08:54 1:13 java -server -Xms3g -Xmx3g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=var/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -cp /opt/druid-0.9.1.1/conf/druid/_common:/opt/druid-0.9.1.1/conf/druid/overlord:/opt/druid-0.9.1.1/lib/* io.druid.cli.Main server overlord

In production there seems to be a second Overlord process running, which doesn’t make much sense to me. Could it have gone stray after some failed task and somehow be making the system unstable? Could this be related to the Middle Manager’s CPU increase?
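If it helps, one way to double-check (assuming I’m reading the indexing service API right; the host uses our default port 8090, and the PIDs are the two from the production listing above) would be:

# ask the indexing service which Overlord is the current leader
$ curl http://overlord-host:8090/druid/indexer/v1/leader

# check the start time and full command line of both local processes
$ ps -o pid,lstart,cmd -p 1682,2653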

Thanks!

Joan

Hi Nishant,

The Middle Manager’s CPU is going to 100% once again. It’s the peon process, which runs with -Xmx2g. I’ve taken a few thread dumps but can’t see anything obvious in them. It’d be great if you guys could take a look and see if anything seems strange.

Thanks!

stacks (67.4 KB)

Any ideas here?

Thanks!

Joan

Any news guys?

Thanks!

Joan

Hi Joan,
Nothing seems odd in the thread dumps. I believe the peon should not be the one causing the CPU spike.

Did you also try running ‘top’ to see which process is actually hogging the CPU? Maybe it’s some other periodic cron job on your system that’s causing this?
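For example, something along these lines (the filters and columns are just suggestions):

# sort by CPU inside top (newer procps top; otherwise press P interactively)
$ top -o %CPU

# or list only the Druid java processes, highest CPU first
$ ps -eo pid,pcpu,pmem,etime,args --sort=-pcpu | grep io.druid.cli.Main | grep -v grep | head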

Hi Nishant,

Thanks for taking a look at this. Yes, we’ve run ‘top’ every time, and the process hogging the CPU is always one of the spawned peons…

Let me give you some more detail about when this happens just in case something sounds familiar:

The CPU goes to 100% with tasks that take longer than our usual ones, which is obviously because there’s a lot more data to be indexed. It almost seems as if there’s a threshold past which the peon says “this task is taking too long in Hadoop, let’s start polling for it much more frequently than usual”.

Now, at this point the task has finished both jobs in Hadoop (determine_partitions and index_generator) without any issues, but the task is still running, as can be seen in the Overlord console, and the peon process is still alive, consuming as much CPU as it can.
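In case it helps to reproduce what we see: the Hadoop UI shows both jobs as succeeded, while something like the following still reports the task as RUNNING and top still shows the peon on top (the task id and host below are placeholders; the status endpoint is the standard Overlord API as far as I know):

# Overlord still reports the task as running even though Hadoop has finished
$ curl http://overlord-host:8090/druid/indexer/v1/task/<task_id>/status

# and the peon is still the top CPU consumer
$ top -b -n 1 | head -15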

We still haven’t been able to solve this, and can’t find anyone who’s had any similar issues, so any help is greatly appreciated.

Joan

That’s weird, I haven’t seen this before.
Have you also tried CPU profiling the task to see what is actually consuming the CPU? Maybe that would provide some more insight.
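Even a “poor man’s profiler” built from repeated jstack samples might be enough to spot the hot frames (the PID is a placeholder):

# take 30 thread dumps one second apart...
$ for i in $(seq 1 30); do jstack <peon_pid>; sleep 1; done > samples.txt

# ...and count which stack frames show up most often
$ grep -E '^[[:space:]]+at ' samples.txt | sort | uniq -c | sort -rn | head -20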

Hi Nishant,

Thank you for taking a look into this.

We have been trying to run CPU profiling of the peon (with jvmtop), but we are running into garbage collection issues. To be more specific, we are able to run the CPU profiling until we get this GC error:

java.rmi.ServerError: Error occurred in server thread; nested exception is:
    java.lang.OutOfMemoryError: GC overhead limit exceeded
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:374)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
    at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:276)
    at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:253)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:162)
    at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
    at javax.management.remote.rmi.RMIConnectionImpl_Stub.isRegistered(Unknown Source)
    at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.isRegistered(RMIConnector.java:864)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.jvmtop.openjdk.tools.ProxyClient$SnapshotInvocationHandler.invoke(Unknown Source)
    at com.sun.proxy.$Proxy1.isRegistered(Unknown Source)
    at com.jvmtop.openjdk.tools.ProxyClient.tryConnect(Unknown Source)
    at com.jvmtop.openjdk.tools.ProxyClient.connect(Unknown Source)
    at com.jvmtop.monitor.VMInfo.attachToVM(Unknown Source)
    at com.jvmtop.monitor.VMInfo.processNewVM(Unknown Source)
    at com.jvmtop.view.VMProfileView.<init>(Unknown Source)
    at com.jvmtop.JvmTop.main(Unknown Source)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "main" java.lang.NullPointerException
    at com.jvmtop.profiler.CPUSampler.<init>(Unknown Source)
    at com.jvmtop.view.VMProfileView.<init>(Unknown Source)
    at com.jvmtop.JvmTop.main(Unknown Source)

Since it looks like an application issue, we tried to take a deeper look at the process memory and the garbage collector:

$ jmap -histo 11403

num #instances #bytes class name

gc.txt (1000 KB)
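If it’s useful to anyone, GC activity on the same PID can also be watched with plain jstat from the JDK, e.g.:

# heap occupancy and GC counters every 5 seconds, 20 samples
$ jstat -gcutil 11403 5000 20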

Hi guys,

Have you been able to spot any issue in the profiling output we attached? Lately, what we’ve done to avoid this problem is to send a lot less data… The solution seems to be to scale the cluster (or change the config), but we’re at a loss as to where. If the Middle Manager only submits tasks to Hadoop and receives the results, then it isn’t the node we need to scale. Perhaps the MapReduce cluster?

Any help is greatly appreciated,

Joan

Hi guys, have you been able to take a look at this issue?

Thanks!

Joan

Hi guys, I’m facing a similar issue: the CPU load average (LA) suddenly jumps to ~58 on a 2-core machine. I use Tranquility to send realtime events. This happened after several failed requests from Tranquility (marked by "sent":0 in the response). For now, I stop the Druid processes directly and the LA goes back to normal. However, I have no idea what the root cause is. I’d appreciate it if anyone could help.