Doubts regarding Druid Configuration

Hi,

We have 6 historical nodes with 16 cores, 112 GB RAM, and 500 GB disk each, and 2 broker nodes with the same configuration.

This is the current configuration.

Historical Nodes:

druid.server.maxSize=100000000000

druid.server.http.numThreads=100

druid.processing.numThreads=15

druid.processing.buffer.sizeBytes=1073741824

druid.query.groupBy.maxResults=1000000

druid.segmentCache.locations=[{"path": "{{ segment_cache_location }}", "maxSize": 100000000000}]

druid.segmentCache.numLoadingThreads=5

druid.segmentCache.numBootstrapThreads=2

druid.processing.numMergeBuffers=10

druid.server.http.maxIdleTime=PT3H
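
For context, my (possibly mistaken) understanding of the direct-memory sizing rule is that each historical needs at least:

(druid.processing.numThreads + druid.processing.numMergeBuffers + 1) * druid.processing.buffer.sizeBytes
= (15 + 10 + 1) * 1073741824 bytes
≈ 26 GiB of direct memory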

Broker Nodes:

druid.server.http.numThreads=200

druid.broker.http.numConnections=100

druid.processing.numMergeBuffers=4

druid.processing.numThreads=15

druid.server.http.maxIdleTime=PT3H

druid.broker.http.readTimeout=PT15M

druid.processing.buffer.sizeBytes=2000000000

druid.broker.retryPolicy.numTries=5
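
By the same (assumed) rule, each broker would need roughly (15 + 4 + 1) * 2000000000 bytes, i.e. about 40 GB of direct memory.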

This config was working quite well, but we saw a couple of issues:

1- Historical nodes are already 90% full.

2- groupBy queries return a "resource limit reached" error when a query returns a huge amount of data.

We thought of increasing the params below (a concrete snippet follows the list):

druid.server.maxSize to 200 GB, and the same for the segment cache location maxSize.

druid.query.groupBy.maxIntermediateRows to 500000

druid.query.groupBy.maxResults to 5000000
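
In runtime.properties terms, the planned change would be roughly this (segment cache path left as our templated value):

druid.server.maxSize=200000000000
druid.segmentCache.locations=[{"path": "{{ segment_cache_location }}", "maxSize": 200000000000}]
druid.query.groupBy.maxIntermediateRows=500000
druid.query.groupBy.maxResults=5000000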

But changing the above params resulted in queries not getting a response within a reasonable time, and a lot of queries got queued up.

Am I missing something here?

Another doubt: I created a dimension using an extraction function in a query and wanted to filter on one of the extracted values, but a simple selector filter didn't work, while a having filter did.
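
To make that concrete, here is a rough sketch of the query fragments involved (the dimension names, value, and regex are made up for illustration):

"dimensions": [
  {
    "type": "extraction",
    "dimension": "url",
    "outputName": "domain",
    "extractionFn": { "type": "regex", "expr": "https?://([^/]+)" }
  }
]

This selector filter on the extracted output did not match anything:

"filter": { "type": "selector", "dimension": "domain", "value": "example.com" }

while this having spec returned the expected rows:

"having": { "type": "dimSelector", "dimension": "domain", "value": "example.com" }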

Hi,

Is there anything unusual with the above config?