Hi,
I have a Druid cluster set up on AWS with 10 Historicals (EC2 instance type i3.8xlarge) and 3 Brokers (EC2 instance type r5.12xlarge). The cluster has one datasource with around 3.5 TB of data.
Queries that scan the entire datasource are very slow, at times taking 5+ minutes to return a response. Segment caching is enabled on the Historical nodes.
For data at this scale, do I need to add more Historical/Broker nodes to get faster query performance? What would be an optimal cluster configuration for this amount of data?
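For context, here is a sketch of the kind of full-scan query that is slow, issued through the Broker's SQL endpoint; the datasource, dimension, and metric names are placeholders, not the real schema:

```python
# Illustrative only: a full-datasource groupBy issued through the Broker's
# SQL endpoint (druid.sql.http.enable=true). Datasource, dimension, and
# metric names are placeholders.
import json
import requests

BROKER_SQL_URL = "http://<broker-host>:8082/druid/v2/sql"  # placeholder host

payload = {
    "query": """
        SELECT dim_country, COUNT(*) AS cnt, SUM(metric_value) AS total_value
        FROM my_datasource
        GROUP BY dim_country
        ORDER BY total_value DESC
        LIMIT 100
    """,
    # 5-minute query timeout, in line with druid.broker.http.readTimeout=PT5M
    "context": {"timeout": 300000},
}

response = requests.post(BROKER_SQL_URL, json=payload, timeout=330)
response.raise_for_status()
print(json.dumps(response.json()[:5], indent=2))
```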
Below is the **Historical node configuration** (derived from the recommendations in the Druid configuration documentation):

```properties
druid.service=druid/historical
druid.port=8083
# HTTP server threads
druid.server.http.numThreads=66
# Processing threads and buffers
druid.processing.buffer.sizeBytes=2147483647
druid.processing.numThreads=31
druid.processing.numMergeBuffers=8
# Caching
druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true
# Segment storage
druid.segmentCache.locations=[{"path":"{{location-to-segment-cache}}","maxSize":6000000000000}]
druid.server.maxSize=6000000000000
druid.query.groupBy.maxMergingDictionarySize=1000000000
druid.query.groupBy.maxOnDiskStorage=2000000000
```
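As a rough sanity check that these processing settings fit on an i3.8xlarge, here is the direct-memory arithmetic (assuming the standard Druid sizing formula of (numThreads + numMergeBuffers + 1) * buffer.sizeBytes):

```python
# Rough direct-memory check for the Historical processing settings above.
# Assumes the standard Druid sizing formula:
#   direct memory >= (numThreads + numMergeBuffers + 1) * buffer.sizeBytes
buffer_size_bytes = 2_147_483_647   # druid.processing.buffer.sizeBytes
num_threads = 31                    # druid.processing.numThreads
num_merge_buffers = 8               # druid.processing.numMergeBuffers

direct_memory_bytes = (num_threads + num_merge_buffers + 1) * buffer_size_bytes
print(f"Historical direct memory needed: {direct_memory_bytes / 2**30:.1f} GiB")
# -> ~80 GiB, which leaves room on an i3.8xlarge (244 GiB RAM) for the heap
#    and the OS page cache that serves the memory-mapped segments.
```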
Below is the **Broker node configuration** (derived from the recommendations in the Druid configuration documentation):

```properties
druid.service=druid/broker
druid.port=8082
# HTTP server threads
druid.broker.http.numConnections=40
druid.server.http.numThreads=83
druid.broker.http.readTimeout=PT5M
# Processing threads and buffers
druid.processing.buffer.sizeBytes=2147483647
druid.processing.numThreads=47
druid.processing.numMergeBuffers=12
# Query result cache
druid.cache.type=caffeine
druid.broker.cache.useResultLevelCache=true
druid.broker.cache.populateResultLevelCache=true
druid.broker.cache.resultLevelCacheLimit=3145728
druid.broker.cache.unCacheable=[]
druid.cache.sizeInBytes=5368709120
# SQL properties
druid.sql.enable=true
druid.sql.http.enable=true
# Group by properties
druid.query.groupBy.defaultStrategy=v2
druid.query.groupBy.maxMergingDictionarySize=1000000000
druid.query.groupBy.maxOnDiskStorage=2000000000
```
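The same rough check for the Broker's processing settings (again assuming the (numThreads + numMergeBuffers + 1) * buffer.sizeBytes formula):

```python
# Rough direct-memory check for the Broker processing settings above,
# using the same assumed formula as for the Historicals.
buffer_size_bytes = 2_147_483_647   # druid.processing.buffer.sizeBytes
num_threads = 47                    # druid.processing.numThreads
num_merge_buffers = 12              # druid.processing.numMergeBuffers

direct_memory_bytes = (num_threads + num_merge_buffers + 1) * buffer_size_bytes
print(f"Broker direct memory needed: {direct_memory_bytes / 2**30:.1f} GiB")
# -> ~120 GiB on an r5.12xlarge (384 GiB RAM), on top of the 5 GiB caffeine
#    result cache (druid.cache.sizeInBytes), which lives on the Broker heap.
```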
Thank you in advance.
Regards,
Vinay