dsql issue!

I have deployed a Druid cluster on 5 nodes: one runs the middleManager; one runs ZooKeeper, the coordinator, the overlord and the router; and the remaining three each run a historical and a broker.
How can I improve Druid performance, for either ingestion or queries, on 100 million rows?

I want query results in less than 1 second, but they currently take 60 seconds. Each node has 32 GB of RAM and 4 cores.

The current configuration of the historicals is:

-Xmx16g

-XX:MaxDirectMemorySize=24g

druid.processing.numThreads=3

druid.processing.numMergeBuffers=1

druid.processing.buffer.sizeBytes=1gb

druid.cache.sizeInBytes=8mb

druid.query.groupBy.maxOnDiskStorage=0

druid.query.groupBy.maxMergingDictionarySize=1gb

druid.server.http.numThreads=60

and the configuration of the brokers is:

druid.broker.http.numConnections=20

druid.broker.http.maxQueuedBytes=0

druid.processing.buffer.sizeBytes=1024000000

druid.processing.numMergeBuffers=1

druid.processing.numThreads=3

What can I do to get the performance I want, with results within milliseconds?

Hi Umar,

I think looking at the Druid metrics will help identify where the bottlenecks are, so you can tune the cluster accordingly. Depending on the findings, you might have to beef up your existing servers and/or add more nodes. Going through this document might also help you tune the cluster: https://github.com/apache/incubator-druid/blob/master/docs/content/operations/basic-cluster-tuning.md
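As a starting point, here is a rough sanity check of the historical memory settings quoted above, written as a small Python sketch. It uses the direct memory sizing rule from the tuning document, (numThreads + numMergeBuffers + 1) * buffer.sizeBytes, and assumes that "1gb" means 1 GiB and that each historical has the full 32 GB of the node available (in this cluster it shares the node with a broker, so the real budget is smaller). The function names are made up for illustration.

```python
# Rough memory sanity check for a single Druid historical.
# Sizing rule from the basic cluster tuning guide:
#   direct memory needed >= (numThreads + numMergeBuffers + 1) * buffer.sizeBytes
# Assumption: "gb" in the post is treated as GiB (1024**3 bytes).

GIB = 1024 ** 3


def required_direct_memory(num_threads, num_merge_buffers, buffer_bytes):
    """Minimum direct memory the processing and merge buffers need."""
    return (num_threads + num_merge_buffers + 1) * buffer_bytes


def check_node(ram_bytes, heap_bytes, max_direct_bytes,
               num_threads, num_merge_buffers, buffer_bytes):
    """Compare the configured JVM limits with the node's physical RAM."""
    needed = required_direct_memory(num_threads, num_merge_buffers, buffer_bytes)
    return {
        "direct_needed_gib": needed / GIB,
        "direct_is_sufficient": max_direct_bytes >= needed,
        "heap_plus_direct_gib": (heap_bytes + max_direct_bytes) / GIB,
        "fits_in_ram": heap_bytes + max_direct_bytes <= ram_bytes,
    }


# Values taken from the historical configuration in the question.
report = check_node(
    ram_bytes=32 * GIB,
    heap_bytes=16 * GIB,
    max_direct_bytes=24 * GIB,
    num_threads=3,
    num_merge_buffers=1,
    buffer_bytes=1 * GIB,
)
print(report)
```

With these numbers the buffers need about 5 GiB of direct memory, while the heap and direct memory limits together allow up to 40 GiB on a 32 GB node. MaxDirectMemorySize is only an upper bound, so this is not necessarily a failure by itself, but historicals memory-map their segments and depend on free memory for the OS page cache, so it may be worth checking how much RAM is actually left for that.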

Thanks,

Sashi

What query are you running?

I have 105867563 rows of sales data. When I run a query such as (select count(1) from (select distinct column1, column2, column3, column4, column5 from SALES where saledate between timestamp '2018-01-01 00:00:00' and timestamp '2018-12-31 00:00:00')), it takes 60 to 70 seconds to return the result.
Here column1…column5 are dimensions and saledate is the timestamp.

What can I do to get the result within milliseconds?
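To see how the 60 to 70 seconds change as settings are tuned, the query can be timed directly against the broker's SQL HTTP endpoint (POST /druid/v2/sql/ with a JSON body containing a "query" field). The following is only a minimal Python sketch: the broker address localhost:8082 is an assumption (8082 is the default broker port), the helper names are made up, and the timestamps are written as full quoted literals, which is a guess at the intended range.

```python
import json
import time
import urllib.request

# Assumed broker address; replace with the host of one of the brokers.
BROKER_SQL_URL = "http://localhost:8082/druid/v2/sql/"

# The query from the post, with quoted timestamp literals (assumed range).
SALES_QUERY = (
    "SELECT COUNT(1) FROM ("
    "SELECT DISTINCT column1, column2, column3, column4, column5 "
    "FROM SALES "
    "WHERE saledate BETWEEN TIMESTAMP '2018-01-01 00:00:00' "
    "AND TIMESTAMP '2018-12-31 00:00:00')"
)


def build_sql_request(sql, url=BROKER_SQL_URL):
    """Build the POST request that the Druid SQL endpoint expects."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def timed_query(sql, url=BROKER_SQL_URL):
    """Run the query and return (rows, elapsed wall-clock seconds)."""
    request = build_sql_request(sql, url)
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        rows = json.loads(response.read())
    return rows, time.perf_counter() - start
```

Calling timed_query(SALES_QUERY) from a machine that can reach the broker returns the result rows together with the elapsed time, which can be compared against the query metrics emitted by the broker and the historicals.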