Can someone explain why my tasks are failing?

Hi All,

My tasks are failing due to insufficient memory, but I'm sure enough memory was available at the time to run them. Please help me find the issue. I have attached my Middle Manager and Historical node configurations as well.

Machine specs:
EC2 instance: r3.2xlarge
Memory: 60 GB
CPU: 8 vCPU

CPU & memory usage: (screenshot attached)

Please help me correct my configuration if anything is inappropriate.

Thanks in advance.

coordinator.jvm (225 Bytes)

coordinator.runtime (114 Bytes)

middlemanager.main_config (182 Bytes)

middlemanager.runtime (790 Bytes)

druid_task_log.txt (4.68 MB)

It looks like the Middle Manager ran out of disk space, so it cannot write any more. Check druid_task_log.txt and look at the exceptions:

```
Exception in thread "plumber_persist_0" java.lang.RuntimeException: java.io.IOException: No space left on device
        at com.google.common.base.Throwables.propagate(Throwables.java:160)
        at io.druid.segment.realtime.plumber.RealtimePlumber.persistHydrant(RealtimePlumber.java:950)
        at io.druid.segment.realtime.plumber.RealtimePlumber$1.doRun(RealtimePlumber.java:320)
        at io.druid.common.guava.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:326)
        at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
        at sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:316)
        at sun.nio.cs.StreamEncoder.close(StreamEncoder.java:149)
        at java.io.OutputStreamWriter.close(OutputStreamWriter.java:233)
        at java.io.BufferedWriter.close(BufferedWriter.java:266)
        at com.google.common.io.Closeables.close(Closeables.java:77)
        at com.metamx.common.io.smoosh.FileSmoosher.close(FileSmoosher.java:240)
        at com.metamx.common.io.smoosh.Smoosh.smoosh(Smoosh.java:66)
        at io.druid.segment.IndexMerger.makeIndexFiles(IndexMerger.java:877)
        at io.druid.segment.IndexMerger.merge(IndexMerger.java:438)
        at io.druid.segment.IndexMerger.persist(IndexMerger.java:186)
        at io.druid.segment.IndexMerger.persist(IndexMerger.java:152)
        at io.druid.segment.realtime.plumber.RealtimePlumber.persistHydrant(RealtimePlumber.java:929)
        ... 5 more
```
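For anyone hitting the same error: the usual fix is to point the peons' task working directory (and the JVM tmp dir) at a volume with enough free space, rather than the root disk. A sketch of the relevant lines in the Middle Manager's runtime.properties — the property names are from the Druid docs, but the paths and heap size here are illustrative, so adjust them to your own layout:

```
# Keep task working files (segment persists, merges) off the root disk
druid.indexer.task.baseTaskDir=/mnt/druid/task

# JVM options passed to each peon: heap size plus a tmp dir with free space
druid.indexer.runner.javaOpts=-server -Xmx3g -Djava.io.tmpdir=/mnt/tmp
```

You can confirm which volume is full with `df -h` before and after a task runs.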

@Carlos,

Thanks for the response. My logs were piling up fast, 22 GB in a day, but I have solved that now.

Also, can you check the configurations of my nodes? I suspect something is wrong with them, since my machines are always under-utilised. Is it okay if I reduce the -Xmx and -Xms values below the recommended ones?
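On the -Xmx/-Xms question: lowering the heap on an under-utilised box is generally fine as long as you leave headroom for the peons and the OS page cache, and it is common to set -Xms equal to -Xmx so the heap does not resize at runtime. As a hedged illustration only (not a recommendation — sizes depend on your workload), a small Middle Manager jvm.config might look like this, since the Middle Manager process itself does little work and the heavy lifting happens in the peon JVMs configured via druid.indexer.runner.javaOpts:

```
-server
-Xms256m
-Xmx256m
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
```

The values above are assumptions for the sake of example; check actual heap usage (e.g. with jstat or your metrics) before shrinking anything in production.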