Hi,
I have enabled memcached and it seems to work “depending on the query”. For example: this query
select count(var)
from data_source;
I see the cache always being hit, as it should.
For this other query
select count(distinct(var))
from data_source;
The cache is never hit.
Why would this be? Below are my cache settings. Also, if you have any more general recommendations for cache settings, they are more than welcome.
This is my memcached command
memcached -m 1024 -d -P /usr/local/druid/memcached.pid -vv > /usr/local/druid/memcached.log 2>&1
Thank You.
common.runtime.properties
Query cache
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
druid.cache.type = memcached
#2Gb
druid.cache.sizeInBytes=2000000000
druid.cache.maxObjectSize = 104857600
druid.cache.hosts = localhost:11211
druid.cache.numConnections = 16
druid.historical.cache.unCacheable =
I don’t see anything about the second query that should prevent it from being cached, unless perhaps you have approximate count distinct disabled and also have groupBy caching disabled (since exact count distincts are planned into groupBys). You could check the planning by adding “explain plan for” before your query.
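As a rough sketch of that check, the second query from the original post can be prefixed like this. The table and column names are the ones from the question; the plan output depends on the Druid version and settings, so none is shown here.

```sql
-- Show how Druid SQL plans the exact count distinct.
-- With approximate count distinct disabled, the native query in the plan
-- is expected to be a groupBy rather than a timeseries.
EXPLAIN PLAN FOR
SELECT COUNT(DISTINCT var)
FROM data_source;
```

If the plan shows a groupBy query type, then whether it is cached depends on the unCacheable setting of whichever node type is doing the caching.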
One thing I do notice is that your caching configs do not quite make sense.
You have set useCache/populateCache on the broker, but not the historical, meaning that the historicals won’t be caching. But you have set unCacheable only on the historicals, meaning it won’t affect the brokers, where you are actually doing your caching. Furthermore, I want to point out that the default Druid configs do caching on historicals but not brokers, because that scales better as the number of historicals increases.
Hi, thanks for your reply.
Yes, approximate count distinct is disabled as I need an exact count, but this would apply to both queries, where one is cached and the other isn’t.
GroupBy cache is enabled (druid.historical.cache.unCacheable = ). Is this what you meant?
I have put all cache properties in _common/common.runtime.properties as my understanding is that all properties in this file are used by all nodes, hence enabling cache with the same properties on the broker, historical and realtime nodes. Is this assumption wrong?
I don’t understand why you assume I have put cache settings in the historical properties file, when this is not the case.
Thank you very much for your help.
Historical caching is controlled by these properties, which aren’t set in your configuration; they default to false:
druid.historical.cache.useCache
druid.historical.cache.populateCache
So in your setup:
- Caching is enabled on brokers.
- Caching is disabled on historicals.
- GroupBy caching is disabled on brokers, because druid.historical.cache.unCacheable is set to an empty value but druid.broker.cache.unCacheable is not set, so it uses the default value, which excludes GroupBy caching.
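A minimal sketch of what the corrected part of common.runtime.properties might look like, based only on the property names mentioned in this thread. Writing the empty list as `[]` is an assumption; the original poster used an empty value for the historical setting, and the accepted form may depend on the Druid version.

```properties
# Broker caching (already present in the original configuration)
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
# Assumption: an empty list means no query type is excluded,
# so groupBy results become cacheable on the broker
druid.broker.cache.unCacheable=[]

# Historical caching (both default to false when unset, per the reply above)
druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true
druid.historical.cache.unCacheable=[]
```

Enabling the cache on both node types is only one option. As noted earlier in the thread, the default Druid setup caches on historicals rather than brokers, because that scales better as the number of historicals grows.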
Of course… you are right, it was an oversight on the parameter names… all working now.
Thanks for your time and patience.