Druid middle manager GC errors (?)

Hi, can someone shed some light on this issue? Our middle managers appear to be crashing in our environment, with “Full GC” messages in their logs.

Unfortunately, we terminated the boxes when they crashed, so we don’t have the logs showing the Full GC or allocation failure messages. However, after restarting with extra heap for the Druid task runners (-Xmx3g), I see this:

2/8/2017 3:58:23 PM 48.308: [GC (Allocation Failure) 48.308: [ParNew: 19648K->2175K(19648K), 0.1016910 secs] 49577K->33701K(63360K), 0.1017607 secs] [Times: user=2.80 sys=0.00, real=0.10 secs]

2/8/2017 3:58:23 PM 49.291: [GC (Allocation Failure) 49.292: [ParNew: 19647K->1892K(19648K), 0.0371262 secs] 51173K->34116K(63360K), 0.0372057 secs] [Times: user=0.78 sys=0.00, real=0.04 secs]

2/8/2017 4:01:22 PM 228.188: [GC (Allocation Failure) 228.189: [ParNew: 19364K->1560K(19648K), 0.0720603 secs] 51588K->34299K(63360K), 0.0721765 secs] [Times: user=1.58 sys=0.02, real=0.07 secs]

2/8/2017 4:04:48 PM 433.506: [GC (Allocation Failure) 433.506: [ParNew: 19032K->1361K(19648K), 0.0524428 secs] 51771K->34507K(63360K), 0.0525451 secs] [Times: user=1.36 sys
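
For anyone who wants to read these lines more systematically, here is a rough Python sketch that pulls the young-generation and whole-heap numbers out of a ParNew line. The function name `parse_parnew` and the regular expression are illustrative assumptions, written only against the log format shown above (-XX:+PrintGCDetails with -XX:+PrintGCTimeStamps); other collectors or JVM versions print different layouts.

```python
import re

# Matches the ParNew portion of a GC line in the format shown above.
# This pattern is an assumption fitted to these sample lines only.
PARNEW_RE = re.compile(
    r"\[ParNew: (?P<young_before>\d+)K->(?P<young_after>\d+)K"
    r"\((?P<young_cap>\d+)K\), (?P<young_secs>[\d.]+) secs\] "
    r"(?P<heap_before>\d+)K->(?P<heap_after>\d+)K"
    r"\((?P<heap_cap>\d+)K\), (?P<pause_secs>[\d.]+) secs\]"
)


def parse_parnew(line):
    """Return sizes in KB and the pause in seconds, or None if no match."""
    match = PARNEW_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Size fields are integers in KB; the two timing fields are seconds.
    result = {k: int(v) for k, v in fields.items() if not k.endswith("_secs")}
    result["pause_secs"] = float(fields["pause_secs"])
    # Fraction of the total heap still occupied after the collection.
    result["heap_used_fraction"] = result["heap_after"] / result["heap_cap"]
    return result


# First log line from the post, without the log viewer's date prefix.
SAMPLE = (
    "48.308: [GC (Allocation Failure) 48.308: "
    "[ParNew: 19648K->2175K(19648K), 0.1016910 secs] "
    "49577K->33701K(63360K), 0.1017607 secs] "
    "[Times: user=2.80 sys=0.00, real=0.10 secs]"
)

if __name__ == "__main__":
    info = parse_parnew(SAMPLE)
    print(info["heap_after"], info["heap_cap"],
          round(info["heap_used_fraction"], 2))
    # prints: 33701 63360 0.53
```

Read this way, the heap capacity of 63360K lines up with the 64m heap in the middle manager jvm.config, and the heap sits at roughly half full after each young collection.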

Our middle manager jvm.config is:

-server

-Xmx64m

-Xms64m

-XX:+UseConcMarkSweepGC

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps

-Duser.timezone=UTC

-Dfile.encoding=UTF-8

-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

-Djava.io.tmpdir=/mnt/tmp

The runtime.properties configuration for our middle managers is:

druid.service=druid/middleManager

druid.host=HOST_IP

druid.port=8091

# Number of tasks per middleManager

druid.worker.capacity=2

# Task launch parameters

druid.indexer.runner.javaOpts=-server -Xmx3g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps

druid.indexer.task.baseTaskDir=/mnt/druid/task

# Peon properties

druid.indexer.fork.property.druid.processing.buffer.sizeBytes=536870912

druid.indexer.fork.property.druid.processing.numThreads=2

druid.indexer.fork.property.druid.segmentCache.locations=[{"path": "/mnt/druid/segment_cache", "maxSize": 0}]

druid.indexer.fork.property.druid.server.http.numThreads=40

# Hadoop indexing

druid.indexer.task.hadoopWorkingPath=/mnt/druid/hadoop-tmp

druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.3.0"]
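
As a sanity check on the settings above, here is a back-of-the-envelope Python sketch of how much memory this configuration could ask of one box. The formula is an assumption based on the sizing rule commonly given for Druid processing buffers (buffer size times the number of processing threads, plus merge buffers, plus one); the function names are hypothetical, `num_merge_buffers` is left at 0 because the configuration above does not set it, and JVM overhead outside heap and direct buffers is ignored.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def peon_direct_memory_bytes(buffer_size_bytes, num_threads,
                             num_merge_buffers=0):
    # Assumed rule: one buffer per processing thread, one per merge
    # buffer, plus one extra buffer.
    return buffer_size_bytes * (num_threads + num_merge_buffers + 1)


def estimate_worker_memory_bytes(capacity, peon_heap_bytes,
                                 buffer_size_bytes, num_threads,
                                 middle_manager_heap_bytes,
                                 num_merge_buffers=0):
    # Each task slot forks one peon with its own heap and direct buffers.
    per_peon = peon_heap_bytes + peon_direct_memory_bytes(
        buffer_size_bytes, num_threads, num_merge_buffers)
    return middle_manager_heap_bytes + capacity * per_peon


if __name__ == "__main__":
    total = estimate_worker_memory_bytes(
        capacity=2,                      # druid.worker.capacity
        peon_heap_bytes=3 * GIB,         # -Xmx3g in javaOpts
        buffer_size_bytes=536870912,     # processing.buffer.sizeBytes
        num_threads=2,                   # processing.numThreads
        middle_manager_heap_bytes=64 * MIB,  # -Xmx64m in jvm.config
    )
    print(total // MIB, "MiB")
    # prints: 9280 MiB
```

Under these assumptions the two task slots plus the middle manager could need a little over 9 GiB when both peons are fully loaded, so it may be worth comparing that against the memory actually available on the box.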

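Full GC is not an error; it is fine to see it in the logs occasionally. The lines quoted above are ParNew young-generation collections (“GC (Allocation Failure)”), which are also routine. Having the complete logs would help in debugging what is crashing the nodes.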