Hi,
I have been constantly running into OOMEs on my historical nodes:
2017-Nov-02 07:02:55 AM [processing-5] ERROR com.google.common.util.concurrent.Futures$CombinedFuture - input future failed.
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.nio.DirectByteBufferR.duplicate(DirectByteBufferR.java:217) ~[?:1.8.0_73]
at java.nio.DirectByteBufferR.asReadOnlyBuffer(DirectByteBufferR.java:234) ~[?:1.8.0_73]
at io.druid.query.aggregation.hyperloglog.HyperUniquesSerde$3.fromByteBuffer(HyperUniquesSerde.java:123) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.query.aggregation.hyperloglog.HyperUniquesSerde$3.fromByteBuffer(HyperUniquesSerde.java:113) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.segment.data.GenericIndexed$BufferIndexed._get(GenericIndexed.java:537) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.segment.data.GenericIndexed$2.get(GenericIndexed.java:158) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.segment.data.GenericIndexed.get(GenericIndexed.java:395) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.segment.column.IndexedComplexColumn.getRowValue(IndexedComplexColumn.java:53) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.segment.QueryableIndexStorageAdapter$CursorSequenceBuilder$1$1QueryableIndexBaseCursor$8.get(QueryableIndexStorageAdapter.java:883) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.query.select.SelectQueryEngine.singleEvent(SelectQueryEngine.java:297) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.query.select.SelectQueryEngine$1.apply(SelectQueryEngine.java:252) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.query.select.SelectQueryEngine$1.apply(SelectQueryEngine.java:215) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.query.QueryRunnerHelper$1.apply(QueryRunnerHelper.java:68) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.query.QueryRunnerHelper$1.apply(QueryRunnerHelper.java:63) ~[druid-processing-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.MappingAccumulator.accumulate(MappingAccumulator.java:42) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.FilteringAccumulator.accumulate(FilteringAccumulator.java:43) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.MappingAccumulator.accumulate(MappingAccumulator.java:42) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.BaseSequence.accumulate(BaseSequence.java:46) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.MappedSequence.accumulate(MappedSequence.java:43) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.FilteredSequence.accumulate(FilteredSequence.java:45) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.MappedSequence.accumulate(MappedSequence.java:43) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.FilteredSequence.accumulate(FilteredSequence.java:45) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.LazySequence.accumulate(LazySequence.java:40) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[java-util-0.10.0.jar:0.10.0]
at io.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[java-util-0.10.0.jar:0.10.0]
Sometimes I also get an exception like this:
2017-Nov-02 10:05:19 AM [qtp1829194516-44] ERROR com.sun.jersey.spi.container.ContainerResponse - The exception contained within MappableContainerException could not be mapped to a response, re-throwing to the HTTP container
java.lang.OutOfMemoryError: GC overhead limit exceeded
The server spec is 32 GB of memory, 8 CPUs, and a 480 GB SSD.
The JVM and runtime configs are as follows:
JVM:
-Xms6g
-Xmx6g
-XX:MaxDirectMemorySize=15g
runtime:
druid.service=druid/historical
druid.port=8083
# HTTP server threads
druid.server.http.numThreads=25
# Processing threads and buffers
druid.processing.buffer.sizeBytes=1073741824
druid.processing.numThreads=7
# Segment storage
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":260000000000}]
druid.server.maxSize=250000000000
# Query cache
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
druid.broker.cache.unCacheable=
# GroupBy
druid.query.groupBy.maxMergingDictionarySize=8000000000
druid.query.groupBy.maxOnDiskStorage=32000000000
druid.query.groupBy.maxIntermediateRows=2000000000
druid.query.groupBy.maxResults=2000000000
I think the memory on the node should be enough.
My rough calculation: 7 * 1073741824 (processing buffers) + (15 + 6) * 1024 * 1024 * 1024 (direct memory plus heap) = 28 GB, which is less than the 32 GB of physical memory.
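To make that budget explicit, here is a minimal sketch of the same arithmetic in Java (the numbers are copied from the configs above; it only sums the explicitly configured pools and ignores metaspace, thread stacks, and memory-mapped segments):

// Back-of-the-envelope memory budget for the historical node,
// using the values from the JVM and runtime configs above.
public class MemoryBudget {
    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;

        long heap              = 6L * gib;             // -Xmx6g
        long directMemory      = 15L * gib;            // -XX:MaxDirectMemorySize=15g
        long processingBuffers = 7L * 1_073_741_824L;  // numThreads * buffer.sizeBytes

        long budgeted = heap + directMemory + processingBuffers;
        long physical = 32L * gib;

        System.out.printf("budgeted: %d GiB, physical: %d GiB%n",
                budgeted / gib, physical / gib);       // budgeted: 28 GiB, physical: 32 GiB
    }
}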
Also, I checked with wc -l /proc/5353/maps, which shows only around 1700 mappings, so I don't think we have reached the point where /proc/sys/vm/max_map_count needs to be adjusted.
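For reference, the same check can be scripted against /proc; this is only a sketch (5353 is the historical process PID from the command above, and the comparison assumes the host exposes /proc/sys/vm/max_map_count as usual):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Equivalent of `wc -l /proc/<pid>/maps`: count the process's memory
// mappings and compare them to the kernel's vm.max_map_count limit.
public class MapCountCheck {
    public static void main(String[] args) throws IOException {
        String pid = args.length > 0 ? args[0] : "5353"; // historical node PID

        long mappings;
        try (Stream<String> lines = Files.lines(Paths.get("/proc", pid, "maps"))) {
            mappings = lines.count();
        }

        long limit = Long.parseLong(
                Files.readAllLines(Paths.get("/proc/sys/vm/max_map_count")).get(0).trim());

        System.out.printf("mappings: %d, vm.max_map_count: %d%n", mappings, limit);
        // ~1700 mappings against the typical default of 65530 is far from the limit.
    }
}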
So I am wondering what could be causing these OOMEs.
Thanks in advance!