Hi Team,
We have a Druid cluster with 10 Historical (i3.8xlarge) and 3 Broker (r5.12xlarge) nodes. When we fire around 50-60 concurrent queries, the Broker goes into a hung state, and we have to restart it to get things running again.
Below are the Historical and Broker configuration details:
1) **Broker configuration**
# HTTP server threads
druid.broker.http.numConnections=100
druid.server.http.numThreads=50
druid.broker.http.readTimeout=PT5M
# Processing threads and buffers
druid.processing.buffer.sizeBytes=2147483647
druid.processing.numThreads=60
# Query cache
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
druid.cache.type=local
druid.cache.sizeInBytes=2000000000
# Query result cache
druid.broker.cache.useResultLevelCache=true
druid.broker.cache.populateResultLevelCache=true
druid.broker.cache.resultLevelCacheLimit=5242880
druid.broker.cache.unCacheable=
druid.sql.enable=true
druid.sql.http.enable=true
druid.processing.numMergeBuffers=20
druid.query.groupBy.defaultStrategy=v2
druid.query.groupBy.maxMergingDictionarySize=1000000000
druid.query.groupBy.maxOnDiskStorage=2000000000
2) **Historical configuration**
# HTTP server threads
druid.server.http.numThreads=50
# Processing threads and buffers
druid.processing.buffer.sizeBytes=1073741824
druid.processing.numThreads=31
# Segment storage
druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true
druid.segmentCache.locations=[{"path":"{path_to_cache}","maxSize":6000000000000}]
druid.server.maxSize=6000000000000
druid.query.groupBy.maxMergingDictionarySize=1000000000
druid.query.groupBy.maxOnDiskStorage=2000000000
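
As a rough cross-check of the processing settings above, the direct memory each process needs can be estimated with the rule of thumb from the Druid docs: `druid.processing.buffer.sizeBytes * (druid.processing.numThreads + druid.processing.numMergeBuffers + 1)`. The sketch below uses a hypothetical helper (`direct_memory_bytes` is not part of Druid) and assumes the Historical falls back to the default `numMergeBuffers` of `max(2, numThreads / 4)`, since that property is not set in the Historical configuration.

```python
GIB = 1024 ** 3


def direct_memory_bytes(buffer_size_bytes, num_threads, num_merge_buffers):
    """Estimate direct memory: one buffer per processing thread,
    one per merge buffer, plus one extra buffer."""
    return buffer_size_bytes * (num_threads + num_merge_buffers + 1)


# Broker values taken from the configuration above.
broker_bytes = direct_memory_bytes(2147483647, 60, 20)

# Historical: numMergeBuffers is not set, so assume the default
# max(2, numThreads / 4) with integer division.
historical_merge_buffers = max(2, 31 // 4)
historical_bytes = direct_memory_bytes(1073741824, 31, historical_merge_buffers)

print(f"Broker:     {broker_bytes / GIB:.1f} GiB")
print(f"Historical: {historical_bytes / GIB:.1f} GiB")
```

With the values above this comes to roughly 162 GiB per Broker and 39 GiB per Historical. The JVM settings are not shown here, so whether `-XX:MaxDirectMemorySize` and the instance RAM leave room for this alongside the heap is something to verify.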
Could someone please help with the following questions?
1) Right now, the cache is enabled on both the Broker and Historical nodes. Will it make any difference if the cache is maintained only on the Historicals?
2) Will switching to the ‘caffeine’ cache bring any improvement?
3) How can we use only the ‘query result’ cache on the Broker node?
4) Are there any configuration tweaks to the Historical/Broker settings above that would help improve query performance?
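
One check that may be relevant to question 4 is how the HTTP thread pools compare with the Broker connection pools. The sketch below assumes the guidance in Druid's cluster tuning docs, as I read it: `druid.server.http.numThreads` on a Historical should be slightly higher than the sum of `druid.broker.http.numConnections` across all Brokers, and on a Broker slightly higher than that Broker's own `druid.broker.http.numConnections`. The variable names are illustrative only.

```python
# Values taken from the configuration in this post.
num_brokers = 3
broker_num_connections = 100   # druid.broker.http.numConnections
broker_http_threads = 50       # druid.server.http.numThreads (Broker)
historical_http_threads = 50   # druid.server.http.numThreads (Historical)

# Worst case: every Broker opens its full connection pool to one Historical.
max_inbound_per_historical = num_brokers * broker_num_connections

historical_ok = historical_http_threads > max_inbound_per_historical
broker_ok = broker_http_threads > broker_num_connections

print(f"Max Broker connections per Historical: {max_inbound_per_historical}")
print(f"Historical HTTP threads sufficient: {historical_ok}")
print(f"Broker HTTP threads sufficient: {broker_ok}")
```

Under that assumption, both checks come out `False` for the current settings (300 possible inbound connections against 50 Historical threads, and 100 outbound connections against 50 Broker threads), so the pool sizes may be worth reviewing.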
Thank you in advance.
Cheers,
Vinay