Query Performance in Production Cluster

Hi Guys,

I have deployed a Druid production cluster on GCP with the following specification:

4 data nodes - 36 CPUs + 136 GB RAM + 500 GB SSD

1 query node - 10 CPUs + 36 GB RAM

1 query node - 8 CPUs + 20 GB RAM

We have 10 billion rows in a Druid datasource, and we have created a dashboard using Tableau and the Avatica JDBC connector.
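For reference, the connection goes through the standard Avatica remote driver; the URL has the following form (host and port here are placeholders, not our real endpoint):

jdbc:avatica:remote:url=http://BROKER_HOST:8082/druid/v2/sql/avatica/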

The datasource's segments are approximately 200 MB - 400 MB per day.

Our problem is query performance.

When I filter the dashboard for 1 year (365 days), the response time is 90 seconds, and when I run "top" in the Linux console, CPU usage on all data nodes is around 90%.

Do you have any ideas, or could you suggest any configuration, please?

Thanks

Jesús

What is your total dataset size in TB for 1 year?

My dataset size is 477.76 GB.

What is the query you are executing and what execution plan is it following? Can you execute it from the Druid console to take the driver out of the equation?
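For example, something along these lines in the console's query view would print the plan without the driver in the picture (the datasource name and time filter are placeholders):

EXPLAIN PLAN FOR
SELECT COUNT(*)
FROM "your_datasource"
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' YEAR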

Yes, I attached my query explain and the data node performance before/after the query.

I tested this other query using the Druid console, and the execution time is 34 seconds.

And currently, I'm the only one using the cluster.

Attachments: after.PNG, queryexplain (2.87 KB)

Any GC activity?

Oh, and run vmstat during the query as well…
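Something like this while the query is running would show it (the pid is whatever your data node's Historical JVM is):

vmstat 1
jstat -gcutil <HISTORICAL_PID> 1000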

This is my jvm.config for the data nodes:

-server
-Xms6g
-Xmx6g
-XX:MaxDirectMemorySize=39g
-XX:+ExitOnOutOfMemoryError
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
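I have not pasted runtime.properties, but for reference Druid's direct memory requirement is roughly druid.processing.buffer.sizeBytes * (druid.processing.numThreads + druid.processing.numMergeBuffers + 1), so a processing configuration consistent with the 39g above would look something like this (illustrative values, not our actual file):

# Direct memory needed ~= buffer.sizeBytes * (numThreads + numMergeBuffers + 1);
# with the values below that is about 35 GB, which fits under MaxDirectMemorySize=39g.
druid.processing.numThreads=35
druid.processing.numMergeBuffers=8
druid.processing.buffer.sizeBytes=800000000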

Are there any STW pauses or anything in the logs?
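If GC logging is not already enabled, adding flags like these to the data node jvm.config (Java 8 syntax shown; the log path is just an example) would make the pause times visible in the log:

-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCApplicationStoppedTime
-Xloggc:/var/log/druid/gc.log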