Memcached

Hi,

I have enabled memcached and it seems to work “depending on the query”. For example, with this query:

select count(var)

from data_source;

I see the cache always being hit, as it should.

For this other query

select count(distinct(var))

from data_source;

The cache is never hit.

Why would this be? Below are my cache settings. Also, if you have any more general recommendations for cache settings, they are more than welcome.

This is my memcached command

memcached -m 1024 -d -P /usr/local/druid/memcached.pid -vv > /usr/local/druid/memcached.log 2>&1

Thank You.

common.runtime.properties

# Query cache

druid.broker.cache.useCache=true

druid.broker.cache.populateCache=true

druid.cache.type = memcached

# 2 GB

druid.cache.sizeInBytes=2000000000

# http://druid.io/docs/latest/configuration/caching.html

druid.cache.maxObjectSize = 104857600

druid.cache.hosts = localhost:11211

druid.cache.numConnections = 16

# http://druid.io/docs/latest/configuration/historical.html

druid.historical.cache.unCacheable =

I don’t see anything about the second query that should prevent it from being cached, unless perhaps you have approximate count distinct disabled and also have groupBy caching disabled (since exact count distincts are planned into groupBys). You could check the planning by adding “explain plan for” before your query.

One thing I do notice is that your caching configs do not quite make sense.

You have set useCache/populateCache on the broker, but not the historical, meaning that the historicals won’t be caching. But you have set unCacheable only on the historicals, meaning it won’t affect the brokers, where you are actually doing your caching. Furthermore, I want to point out that the default Druid configs do caching on historicals but not brokers, because that scales better as the number of historicals increases.

Hi, thanks for your reply.

Yes, approximate count distinct is disabled as I need an exact count, but this would apply to both queries, where one is cached and the other isn’t.

GroupBy cache is enabled (druid.historical.cache.unCacheable = ). Is this what you meant?

I have put all cache properties in _common/common.runtime.properties as my understanding is that all properties in this file are used by all nodes, hence enabling cache with the same properties on the broker, historical and realtime nodes. Is this assumption wrong?

I don’t understand why you assume I have put cache settings in the historical properties file, when this is not the case.

Thank you very much for your help.

Historical caching is controlled by these properties, which aren’t set in your configuration and default to false:

druid.historical.cache.useCache

druid.historical.cache.populateCache

So in your setup:

  • Caching is enabled on brokers

  • Caching is disabled on historicals

  • GroupBy caching is disabled on brokers, because druid.historical.cache.unCacheable is set to an empty value but druid.broker.cache.unCacheable is not set, so it uses the default value, which excludes GroupBy caching
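For reference, here is a minimal sketch of how the property names could be lined up so that each setting applies to the node type doing the caching. It only uses property names already mentioned in this thread. The empty-list value `[]` for unCacheable is an assumption about the expected syntax, so please check it against the caching docs for your Druid version. You would normally pick either the broker block or the historical block rather than both.

```properties
# Sketch only, not a verified configuration.

# Option A: cache on the broker (what the original file enables)
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
# Assumed syntax: an empty list so that no query type (including groupBy)
# is excluded from caching on the broker
druid.broker.cache.unCacheable=[]

# Option B: cache on the historicals instead
druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true
# Same assumption about the empty-list syntax as above
druid.historical.cache.unCacheable=[]

# Shared memcached settings, unchanged from the original post
druid.cache.type=memcached
druid.cache.hosts=localhost:11211
```

The point is that the `druid.broker.cache.*` and `druid.historical.cache.*` prefixes are read independently, so putting everything in the common file does not make a historical-prefixed setting apply to the broker.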

Of course… you are right, it was an oversight on the parameter names… all working now.

Thanks for your time and patience.