Updating to Druid 0.10: MiddleManager peons failing with OutOfMemoryError

I am trying to update from Druid 0.9.2 to 0.10.0. I ran into a few memory issues and was able to resolve them by raising the memory limits. However, I am now running into an issue with indexing tasks (our realtime tasks are just fine): the peon running the task goes OOM.

2017-08-19T15:49:39,425 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Uncaught Throwable while running task[IndexTask{id=sand_hourly_deals_ Hourly Cube_2017-08-19_H1_retry_3_at_2017-08-19T15-45-24, type=index, dataSource=deals_sand_v2}]
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
        at java.util.ArrayList.<init>(ArrayList.java:152) ~[?:1.8.0_101]
        at com.google.common.collect.Lists.newArrayListWithCapacity(Lists.java:175) ~[guava-16.0.1.jar:?]
        at io.druid.indexing.common.task.IndexTask.determineShardSpecs(IndexTask.java:321) ~[druid-indexing-service-0.10.0.jar:0.10.0]
        at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:185) ~[druid-indexing-service-0.10.0.jar:0.10.0]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.10.0.jar:0.10.0]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.10.0.jar:0.10.0]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
2017-08-19T15:49:39,429 ERROR [main] io.druid.cli.CliPeon - Error when starting up.  Failing.
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
        at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
        at io.druid.indexing.worker.executor.ExecutorLifecycle.join(ExecutorLifecycle.java:212) ~[druid-indexing-service-0.10.0.jar:0.10.0]
        at io.druid.cli.CliPeon.run(CliPeon.java:290) [druid-services-0.10.0.jar:0.10.0]
        at io.druid.cli.Main.main(Main.java:108) [druid-services-0.10.0.jar:0.10.0]
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
        at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-16.0.1.jar:?]
        at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-16.0.1.jar:?]
        at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-16.0.1.jar:?]
        at io.druid.indexing.worker.executor.ExecutorLifecycle.join(ExecutorLifecycle.java:209) ~[druid-indexing-service-0.10.0.jar:0.10.0]
        ... 2 more
Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
        at java.util.ArrayList.<init>(ArrayList.java:152) ~[?:1.8.0_101]
        at com.google.common.collect.Lists.newArrayListWithCapacity(Lists.java:175) ~[guava-16.0.1.jar:?]
        at io.druid.indexing.common.task.IndexTask.determineShardSpecs(IndexTask.java:321) ~[druid-indexing-service-0.10.0.jar:0.10.0]
        at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:185) ~[druid-indexing-service-0.10.0.jar:0.10.0]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) ~[druid-indexing-service-0.10.0.jar:0.10.0]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) ~[druid-indexing-service-0.10.0.jar:0.10.0]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_101]
        at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_101]
2017-08-19T15:49:39,430 INFO [Thread-51] io.druid.cli.CliPeon - Running shutdown hook


The MiddleManager is an r4.2xlarge with the following peon config:

# Resources for peons
druid.indexer.runner.javaOpts=-server -Xmx8g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
druid.indexer.task.baseTaskDir=/mnt/druid/task/

# Peon properties
druid.indexer.fork.property.druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
druid.indexer.fork.property.druid.processing.buffer.sizeBytes=536870912

druid.indexer.fork.property.druid.processing.numThreads=2
druid.indexer.fork.property.druid.segmentCache.locations=[{"path": "/mnt/druid/zk_druid", "maxSize": 0}]
druid.indexer.fork.property.druid.server.http.numThreads=50
druid.indexer.fork.property.druid.storage.archiveBaseKey=sand
druid.indexer.fork.property.druid.storage.archiveBucket=sand-druid.tld.com
druid.indexer.fork.property.druid.storage.baseKey=sand/v1
druid.indexer.fork.property.druid.storage.bucket=sand-druid.tld.com
druid.indexer.fork.property.druid.storage.type=s3

druid.worker.capacity=9

Any help appreciated!

Hi, can you tell me exactly how much RAM is available on your machine?

druid.indexer.runner.javaOpts is applied to each peon you run, and in your case you have asked for 9 peons (druid.worker.capacity). So you need 8GB * 9 = 72GB of available memory.
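
To make that arithmetic easy to redo with other settings, here is a minimal Python sketch. The function name is made up for illustration, and the 61 GiB figure is an assumption based on the nominal memory of an r4.2xlarge; please check the actual free memory on your host.

```python
# Rough heap budget for a MiddleManager: each peon is a separate JVM
# started with the options in druid.indexer.runner.javaOpts.
GIB = 1024 ** 3

def total_peon_heap_bytes(xmx_bytes: int, worker_capacity: int) -> int:
    """Worst-case heap if every peon slot is busy at the same time."""
    return xmx_bytes * worker_capacity

xmx = 8 * GIB            # -Xmx8g from druid.indexer.runner.javaOpts
capacity = 9             # druid.worker.capacity
machine_ram = 61 * GIB   # ASSUMPTION: nominal r4.2xlarge memory, verify on your host

needed = total_peon_heap_bytes(xmx, capacity)
print(f"peon heap needed: {needed // GIB} GiB")
print(f"fits in machine RAM: {needed <= machine_ram}")
```

This only counts peon heap. The MiddleManager's own JVM and any off-heap buffers come on top of it.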

If you are on 0.10 now, you can check whether 8GB per peon is enough like this:

buffer memory (536870912) * [druid.indexer.fork.property.druid.processing.numThreads (2) + merge buffers (default 2) + 1]

536870912 * [2 + 2 + 1] should be less than the memory you have given in javaOpts, which is 8GB. With your configuration the product is 2,684,354,560 bytes, about 2.7GB (2.5GiB) per peon, and you have given 8GB.
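
The same check as a small Python sketch, assuming the terms add up as in the 2+2+1 arithmetic above. The function name is hypothetical, and the default of 2 merge buffers is taken from this reply rather than looked up for your exact version.

```python
GIB = 1024 ** 3

def processing_buffer_bytes(size_bytes: int, num_threads: int,
                            num_merge_buffers: int = 2) -> int:
    """Buffer memory per peon: sizeBytes * (numThreads + mergeBuffers + 1)."""
    return size_bytes * (num_threads + num_merge_buffers + 1)

size_bytes = 536870912   # druid.indexer.fork.property.druid.processing.buffer.sizeBytes
num_threads = 2          # druid.indexer.fork.property.druid.processing.numThreads
xmx = 8 * GIB            # -Xmx8g from druid.indexer.runner.javaOpts

per_peon = processing_buffer_bytes(size_bytes, num_threads)
print(f"buffer memory per peon: {per_peon} bytes ({per_peon / GIB} GiB)")
print(f"below the 8GB limit: {per_peon < xmx}")
```

This models only the processing buffers. Any other per-peon overhead is not included, so treat the result as a lower bound.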