io.druid.cli.CliPeon fails to start

Druid version: I have tried 0.9.1.2 and 0.9.3rc, both recompiled against Hadoop 2.6.0; both fail with the same error.

Host env: CDH5.6.0 hadoop2.6.0

jdk: 1.7.0_67

apache slider version: 0.91.0

Recently I started running Druid on YARN with the help of the Apache Slider framework. Everything worked fine until I started loading data with MapReduce.

The MapReduce job on YARN completes successfully, but I then hit the strange Druid error below. I have increased the PermGen settings (see my configs below) and checked the heap with the jmap -heap command: Perm Generation usage is below 30% after the change. So this is really strange. Could anyone offer me some advice? Thanks. :)

java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: PermGen space

at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]

at io.druid.indexing.worker.executor.ExecutorLifecycle.join(ExecutorLifecycle.java:211) ~[druid-indexing-service-0.9.1.2-SNAPSHOT.jar:0.9.1.2-SNAPSHOT]

at io.druid.cli.CliPeon.run(CliPeon.java:287) [druid-services-0.9.1.2-SNAPSHOT.jar:0.9.1.2-SNAPSHOT]

at io.druid.cli.Main.main(Main.java:105) [druid-services-0.9.1.2-SNAPSHOT.jar:0.9.1.2-SNAPSHOT]

Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: PermGen space

at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-16.0.1.jar:?]

at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-16.0.1.jar:?]

at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-16.0.1.jar:?]

at io.druid.indexing.worker.executor.ExecutorLifecycle.join(ExecutorLifecycle.java:208) ~[druid-indexing-service-0.9.1.2-SNAPSHOT.jar:0.9.1.2-SNAPSHOT]

... 2 more

Caused by: java.lang.OutOfMemoryError: PermGen space

at java.lang.ClassLoader.defineClass1(Native Method) ~[?:1.7.0_67]

at java.lang.ClassLoader.defineClass(ClassLoader.java:800) ~[?:1.7.0_67]

at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) ~[?:1.7.0_67]

at java.net.URLClassLoader.defineClass(URLClassLoader.java:449) ~[?:1.7.0_67]

at java.net.URLClassLoader.access$100(URLClassLoader.java:71) ~[?:1.7.0_67]

at java.net.URLClassLoader$1.run(URLClassLoader.java:361) ~[?:1.7.0_67]

at java.net.URLClassLoader$1.run(URLClassLoader.java:355) ~[?:1.7.0_67]

at java.security.AccessController.doPrivileged(Native Method) ~[?:1.7.0_67]

at java.net.URLClassLoader.findClass(URLClassLoader.java:354) ~[?:1.7.0_67]

at java.lang.ClassLoader.loadClass(ClassLoader.java:425) ~[?:1.7.0_67]

at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) ~[?:1.7.0_67]

at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ~[?:1.7.0_67]

at java.lang.Class.forName0(Native Method) ~[?:1.7.0_67]

at java.lang.Class.forName(Class.java:190) ~[?:1.7.0_67]

at org.apache.logging.log4j.util.LoaderUtil.loadClass(LoaderUtil.java:122) ~[log4j-api-2.5.jar:2.5]

at org.apache.logging.log4j.core.util.Loader.loadClass(Loader.java:228) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.impl.ThrowableProxy.loadClass(ThrowableProxy.java:496) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.impl.ThrowableProxy.toExtendedStackTrace(ThrowableProxy.java:617) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:163) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:165) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:138) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:117) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.impl.Log4jLogEvent.getThrownProxy(Log4jLogEvent.java:482) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.pattern.ExtendedThrowablePatternConverter.format(ExtendedThrowablePatternConverter.java:64) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.pattern.PatternFormatter.format(PatternFormatter.java:36) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.layout.PatternLayout$PatternSerializer.toSerializable(PatternLayout.java:292) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.layout.PatternLayout.toSerializable(PatternLayout.java:206) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.layout.PatternLayout.toSerializable(PatternLayout.java:56) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.layout.AbstractStringLayout.toByteArray(AbstractStringLayout.java:148) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:112) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152) ~[log4j-core-2.5.jar:2.5]

at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125) ~[log4j-core-2.5.jar:2.5]

2016-10-28T07:34:44,466 INFO [Thread-30] io.druid.cli.CliPeon - Running shutdown hook

Exception in thread "Thread-30" java.lang.OutOfMemoryError: PermGen space

at java.lang.ClassLoader.defineClass1(Native Method)

at java.lang.ClassLoader.defineClass(ClassLoader.java:800)

at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)

at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)

at java.net.URLClassLoader.access$100(URLClassLoader.java:71)

at java.net.URLClassLoader$1.run(URLClassLoader.java:361)

at java.net.URLClassLoader$1.run(URLClassLoader.java:355)

at java.security.AccessController.doPrivileged(Native Method)

at java.net.URLClassLoader.findClass(URLClassLoader.java:354)

at java.lang.ClassLoader.loadClass(ClassLoader.java:425)

at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)

at java.lang.ClassLoader.loadClass(ClassLoader.java:358)

at com.google.common.collect.Lists.reverse(Lists.java:754)

at com.metamx.common.lifecycle.Lifecycle.stop(Lifecycle.java:274)

at io.druid.cli.CliPeon$2.run(CliPeon.java:282)

at java.lang.Thread.run(Thread.java:745)

2016-10-28T07:34:45,620 WARN [Thread-3] org.apache.hadoop.util.ShutdownHookManager - ShutdownHook 'ClientFinalizer' failed, java.lang.OutOfMemoryError: PermGen space

java.lang.OutOfMemoryError: PermGen space

my configs:

./middleManager/jvm.config-server

-Xmx1024m

-Xms1024m

-XX:+UseConcMarkSweepGC

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps

-XX:PermSize=256m

-XX:MaxPermSize=256m

-Duser.timezone=UTC

-Dfile.encoding=UTF-8

-Djava.io.tmpdir=var/tmp

-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

./broker/jvm.config-server

-Xms4g

-Xmx4g

-Xms24g

-Xmx24g

-XX:MaxDirectMemorySize=4096m

-XX:PermSize=256m

-XX:MaxPermSize=256m

-XX:+UseConcMarkSweepGC

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps

-Duser.timezone=UTC

-Dfile.encoding=UTF-8

-Djava.io.tmpdir=var/tmp

-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

./coordinator/jvm.config-server

-Xms3g

-Xmx3g

-XX:NewSize=256m

-XX:MaxNewSize=256m

-XX:+UseG1GC

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps

-XX:PermSize=256m

-XX:MaxPermSize=256m

-Dfile.encoding=UTF-8

-Djava.io.tmpdir=var/tmp

-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

-Dderby.stream.error.file=var/druid/derby.log

./historical/jvm.config-server

-Xms8g

-Xmx8g

-XX:MaxDirectMemorySize=4096m

-XX:PermSize=256m

-XX:MaxPermSize=256m

-XX:+UseConcMarkSweepGC

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps

-Duser.timezone=UTC

-Dfile.encoding=UTF-8

-Djava.io.tmpdir=var/tmp

-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

./overlord/jvm.config-server

-Xms4g

-Xmx4g

-XX:NewSize=256m

-XX:MaxNewSize=256m

-XX:+UseConcMarkSweepGC

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps

-XX:PermSize=256m

-XX:MaxPermSize=256m

-Duser.timezone=UTC

-Dfile.encoding=UTF-8

-Djava.io.tmpdir=var/tmp

-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

Hey Zhaorong,

PermGen settings on the middle manager are not inherited by the peon by default, and the peon is where your OOM is happening. Try setting the following in your middle manager’s runtime.properties:

druid.indexer.runner.javaOpts=-XX:PermSize=256m -XX:MaxPermSize=256m

(or add the parameters to whatever you already have there for druid.indexer.runner.javaOpts)
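For the "add to what you already have" case, a combined line might look like the sketch below. The -server, -Xmx2g and -D flags are placeholder assumptions standing in for whatever is already configured; only the two PermGen flags come from the suggestion above.

```
# middleManager/runtime.properties
# Assumed existing options (-server, -Xmx2g, -D flags) are placeholders; keep your own.
# The two PermGen flags are the actual change.
druid.indexer.runner.javaOpts=-server -Xmx2g -XX:PermSize=256m -XX:MaxPermSize=256m -Duser.timezone=UTC -Dfile.encoding=UTF-8
```

The middle manager will most likely need a restart before newly launched peons pick this up. To confirm the setting took effect, run jmap -heap against the peon's process ID rather than the middle manager's, since they are separate JVMs.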

Oh, it works! David, thanks so much!! :-)