Exception while pulling item from cache : Memcache

Hi,

I am using memcache for caching in druid.I have enabled useCache on broker and populate and useCache on historical node.

I have created a druid cluster with two broker node and two historical nodes along with other services.And broker nodes are behind a load balancer.

But while querying data we are getting this exception both in broker and historical node logs.

io.druid.client.cache.MemcachedCache - Exception pulling item from cache.

Stack trace :

2016-12-29T07:31:46,648 INFO [main] org.eclipse.jetty.server.handler.ContextHandler - Started o.e.j.s.ServletContextHandler@781711b7{/,null,AVAILABLE}

2016-12-29T07:31:46,653 INFO [main] org.eclipse.jetty.server.ServerConnector - Started ServerConnector@4dd90166{HTTP/1.1}{0.0.0.0:8083}

2016-12-29T07:31:46,653 INFO [main] org.eclipse.jetty.server.Server - Started @4119ms

2016-12-29T07:31:46,654 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking start method[public void io.druid.server.log.FileRequestLogger.start()] on object[io.druid.server.log.FileRequestLogger@4ed5a1b0].

2016-12-29T07:31:46,657 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking start method[public void io.druid.query.lookup.LookupReferencesManager.start()] on object[io.druid.query.lookup.LookupReferencesManager@79f90a3a].

2016-12-29T07:31:46,657 INFO [main] io.druid.query.lookup.LookupReferencesManager - Started lookup factory references manager

2016-12-29T07:31:46,657 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking start method[public void io.druid.server.listener.announcer.ListenerResourceAnnouncer.start()] on object[io.druid.query.lookup.LookupResourceListenerAnnouncer@22bdb1d0].

2016-12-29T07:31:46,662 INFO [main] io.druid.server.listener.announcer.ListenerResourceAnnouncer - Announcing start time on [/druid/prod/listeners/lookups/__default/10.74.86.139:8083]

2016-12-29T07:31:46,662 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking start method[public void io.druid.server.coordination.AbstractDataSegmentAnnouncer.start()] on object[io.druid.server.coordination.BatchDataSegmentAnnouncer@3b83459e].

2016-12-29T07:31:46,663 INFO [main] io.druid.server.coordination.AbstractDataSegmentAnnouncer - Announcing self[DruidServerMetadata{name=‘10.74.86.139:8083’, host=‘10.74.86.139:8083’, maxSize=21207667507, tier=’_default_tier’, type=‘historical’, priority=‘0’}] at [/druid/prod/announcements/10.74.86.139:8083]

CMS: abort preclean due to time 7.285: [CMS-concurrent-abortable-preclean: 2.634/5.047 secs] [Times: user=9.10 sys=0.33, real=5.04 secs]

7.286: [GC (CMS Final Remark) [YG occupancy: 1121063 K (5662336 K)]7.286: [Rescan (parallel) , 0.0242922 secs]7.310: [weak refs processing, 0.0000240 secs]7.310: [class unloading, 0.0066592 secs]7.317: [scrub symbol table, 0.0067175 secs]7.324: [scrub string table, 0.0006129 secs][1 CMS-remark: 0K(6291456K)] 1121063K(11953792K), 0.0398984 secs] [Times: user=0.18 sys=0.00, real=0.04 secs]

7.326: [CMS-concurrent-sweep-start]

7.326: [CMS-concurrent-sweep: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

7.326: [CMS-concurrent-reset-start]

7.351: [CMS-concurrent-reset: 0.025/0.025 secs] [Times: user=0.01 sys=0.02, real=0.03 secs]

2016-12-29T07:38:57,105 WARN [timeBoundary_CANCEL_[2016-12-04T00:00:00.000Z/2016-12-05T00:00:00.000Z]] io.druid.client.cache.MemcachedCache - Exception pulling item from cache

java.util.concurrent.ExecutionException: java.util.concurrent.CancellationException: Cancelled

at net.spy.memcached.internal.OperationFuture.get(OperationFuture.java:177) ~[spymemcached-2.11.7.jar:2.11.7]

at net.spy.memcached.internal.GetFuture.get(GetFuture.java:69) ~[spymemcached-2.11.7.jar:2.11.7]

at io.druid.client.cache.MemcachedCache.get(MemcachedCache.java:453) [druid-server-0.9.2.jar:0.9.2]

at io.druid.client.CachingQueryRunner.run(CachingQueryRunner.java:114) [druid-server-0.9.2.jar:0.9.2]

at io.druid.query.BySegmentQueryRunner.run(BySegmentQueryRunner.java:70) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.MetricsEmittingQueryRunner$1.accumulate(MetricsEmittingQueryRunner.java:104) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner$2$1.call(SpecificSegmentQueryRunner.java:87) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner.doNamed(SpecificSegmentQueryRunner.java:171) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner.access$400(SpecificSegmentQueryRunner.java:41) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner$2.doItNamed(SpecificSegmentQueryRunner.java:162) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner$2.accumulate(SpecificSegmentQueryRunner.java:80) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.CPUTimeMetricQueryRunner$1.accumulate(CPUTimeMetricQueryRunner.java:81) [druid-processing-0.9.2.jar:0.9.2]

at com.metamx.common.guava.Sequences$1.accumulate(Sequences.java:90) [java-util-0.27.10.jar:?]

at com.metamx.common.guava.Sequences.toList(Sequences.java:113) [java-util-0.27.10.jar:?]

at io.druid.query.ChainedExecutionQueryRunner$1$1$1.call(ChainedExecutionQueryRunner.java:129) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.ChainedExecutionQueryRunner$1$1$1.call(ChainedExecutionQueryRunner.java:119) [druid-processing-0.9.2.jar:0.9.2]

at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_91]

at io.druid.query.PrioritizedListenableFutureTask.run(PrioritizedExecutorService.java:271) [druid-processing-0.9.2.jar:0.9.2]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]

at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]

Caused by: java.util.concurrent.CancellationException: Cancelled

… 21 more

2016-12-29T07:38:57,121 INFO [timeBoundary_CANCEL_[2016-12-04T00:00:00.000Z/2016-12-05T00:00:00.000Z]] io.druid.segment.CompressedPools - Allocating new littleEndByteBuf[1]

2016-12-29T07:39:07,932 WARN [segmentMetadata_CANCEL_[2016-11-27T00:34:05.000Z/2016-11-28T00:00:00.000Z]] io.druid.client.cache.MemcachedCache - Exception pulling item from cache

java.util.concurrent.ExecutionException: java.util.concurrent.CancellationException: Cancelled

at net.spy.memcached.internal.OperationFuture.get(OperationFuture.java:177) ~[spymemcached-2.11.7.jar:2.11.7]

at net.spy.memcached.internal.GetFuture.get(GetFuture.java:69) ~[spymemcached-2.11.7.jar:2.11.7]

at io.druid.client.cache.MemcachedCache.get(MemcachedCache.java:453) [druid-server-0.9.2.jar:0.9.2]

at io.druid.client.CachingQueryRunner.run(CachingQueryRunner.java:114) [druid-server-0.9.2.jar:0.9.2]

at io.druid.query.BySegmentQueryRunner.run(BySegmentQueryRunner.java:70) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.MetricsEmittingQueryRunner$1.accumulate(MetricsEmittingQueryRunner.java:104) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner$2$1.call(SpecificSegmentQueryRunner.java:87) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner.doNamed(SpecificSegmentQueryRunner.java:171) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner.access$400(SpecificSegmentQueryRunner.java:41) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner$2.doItNamed(SpecificSegmentQueryRunner.java:162) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner$2.accumulate(SpecificSegmentQueryRunner.java:80) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.CPUTimeMetricQueryRunner$1.accumulate(CPUTimeMetricQueryRunner.java:81) [druid-processing-0.9.2.jar:0.9.2]

at com.metamx.common.guava.Sequences$1.accumulate(Sequences.java:90) [java-util-0.27.10.jar:?]

at com.metamx.common.guava.Sequences.toList(Sequences.java:113) [java-util-0.27.10.jar:?]

at io.druid.query.metadata.SegmentMetadataQueryRunnerFactory$2$1$1.call(SegmentMetadataQueryRunnerFactory.java:213) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.metadata.SegmentMetadataQueryRunnerFactory$2$1$1.call(SegmentMetadataQueryRunnerFactory.java:208) [druid-processing-0.9.2.jar:0.9.2]

at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_91]

at io.druid.query.PrioritizedListenableFutureTask.run(PrioritizedExecutorService.java:271) [druid-processing-0.9.2.jar:0.9.2]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]

at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]

Caused by: java.util.concurrent.CancellationException: Cancelled

… 21 more

2016-12-29T07:39:08,437 WARN [segmentMetadata_CANCEL_[2016-11-28T00:00:00.000Z/2016-11-29T00:00:00.000Z]] io.druid.client.cache.MemcachedCache - Exception pulling item from cache

java.util.concurrent.ExecutionException: java.util.concurrent.CancellationException: Cancelled

at net.spy.memcached.internal.OperationFuture.get(OperationFuture.java:177) ~[spymemcached-2.11.7.jar:2.11.7]

at net.spy.memcached.internal.GetFuture.get(GetFuture.java:69) ~[spymemcached-2.11.7.jar:2.11.7]

at io.druid.client.cache.MemcachedCache.get(MemcachedCache.java:453) [druid-server-0.9.2.jar:0.9.2]

at io.druid.client.CachingQueryRunner.run(CachingQueryRunner.java:114) [druid-server-0.9.2.jar:0.9.2]

at io.druid.query.BySegmentQueryRunner.run(BySegmentQueryRunner.java:70) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.MetricsEmittingQueryRunner$1.accumulate(MetricsEmittingQueryRunner.java:104) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner$2$1.call(SpecificSegmentQueryRunner.java:87) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner.doNamed(SpecificSegmentQueryRunner.java:171) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner.access$400(SpecificSegmentQueryRunner.java:41) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner$2.doItNamed(SpecificSegmentQueryRunner.java:162) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.spec.SpecificSegmentQueryRunner$2.accumulate(SpecificSegmentQueryRunner.java:80) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.CPUTimeMetricQueryRunner$1.accumulate(CPUTimeMetricQueryRunner.java:81) [druid-processing-0.9.2.jar:0.9.2]

at com.metamx.common.guava.Sequences$1.accumulate(Sequences.java:90) [java-util-0.27.10.jar:?]

at com.metamx.common.guava.Sequences.toList(Sequences.java:113) [java-util-0.27.10.jar:?]

at io.druid.query.metadata.SegmentMetadataQueryRunnerFactory$2$1$1.call(SegmentMetadataQueryRunnerFactory.java:213) [druid-processing-0.9.2.jar:0.9.2]

at io.druid.query.metadata.SegmentMetadataQueryRunnerFactory$2$1$1.call(SegmentMetadataQueryRunnerFactory.java:208) [druid-processing-0.9.2.jar:0.9.2]

at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_91]

at io.druid.query.PrioritizedListenableFutureTask.run(PrioritizedExecutorService.java:271) [druid-processing-0.9.2.jar:0.9.2]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]

at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]

Caused by: java.util.concurrent.CancellationException: Cancelled

… 21 more

Common Config :

druid.cache.type=memcached

druid.cache.hosts=10.226.76.151:11211,10.226.76.152:11211

druid.cache.expiration=2592000

druid.cache.memcachedPrefix=d1

druid.cache.maxObjectSize=52428800

druid.cache.readBufferSize=10485760

Broker Config

druid.broker.cache.useCache=true

druid.broker.cache.unCacheable=

druid.broker.cache.type=memcached

Historical Config

druid.historical.cache.useCache=true

druid.historical.cache.populateCache=true

druid.historical.cache.unCacheable=

druid.historical.cache.type=memcached

Also we have enabled graphite to see metrics but memcahced metric is not there.

Looks like this problem was because of connection down with memcache.
Now verified everything is working fine.