Question about historical nodes for query performance?

I have a quick question before I ask another one. I’m trying to tune the performance of my queries. Currently they usually take under 1s, but sometimes they time out.

My question is: how does a query work if it covers the last 3 days of data? Would it get the last hour from the Index nodes and the rest from the historical nodes? My queries are really fast when the interval is around 1 hour, but they get slower when I expand the query to 3 days. Does that mean it’s taking some time to load the segments?

Thanks!

Hi Noppanit,

In general, Druid only queries the segments necessary for the interval you ask for. In most realtime configs the last 1-2 hours of data are in realtime tasks and anything older than that is on historicals. So you’re probably hitting historicals on those wider time range queries. Maybe try checking system stats on the historicals to see if their CPU usage is spiking.
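As a rough illustration of that split, here is a minimal Python sketch of which part of a query interval would land on realtime tasks versus historicals. The `split_interval` helper and the fixed 2-hour `realtime_window` are assumptions for illustration only; in a real cluster the handoff point depends on your segment granularity and ingestion settings, and the broker works this out per segment.

```python
from datetime import datetime, timedelta, timezone


def split_interval(start, end, now, realtime_window=timedelta(hours=2)):
    """Split [start, end) into the part older than the realtime window
    (assumed to be on historicals) and the part inside it (assumed to be
    on realtime tasks). Either part is None if the query does not touch it.
    """
    cutoff = now - realtime_window  # hypothetical handoff point
    historical = (start, min(end, cutoff)) if start < cutoff else None
    realtime = (max(start, cutoff), end) if end > cutoff else None
    return historical, realtime


now = datetime(2016, 1, 4, 12, 0, tzinfo=timezone.utc)

# A 3-day query mostly hits historicals; a 1-hour query stays on realtime tasks.
three_days = split_interval(now - timedelta(days=3), now, now)
one_hour = split_interval(now - timedelta(hours=1), now, now)
```

Under these assumptions the 1-hour query never touches historicals, while almost all of the 3-day query does, which would fit the slowdown you are seeing.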

Thanks for your suggestion. I have a follow-up question about that. We’re trying to tune the query to be as fast as possible. Currently it takes about 20s for a query spanning one day of data. According to my Druid coordinator, we have ingested about 500GB of data. We turned caching off on the Broker and enabled caching on the Historical nodes. We got some performance improvement, from 50s to 20s. I’m wondering if it can be any faster. Which settings do you need to see? I can copy them here if that will help with fine-tuning the performance.
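For reference, the kind of query being timed looks roughly like the sketch below: a native timeseries query over a one-day span. The `build_timeseries_query` helper, the data source name `events`, the hourly granularity, and the single `count` aggregator are placeholders rather than our real query; `useCache`, `populateCache`, and `timeout` are set through the query context.

```python
import json
from datetime import datetime, timedelta, timezone


def build_timeseries_query(data_source, end, span, use_cache=True):
    """Build the JSON body of a Druid native timeseries query covering
    the `span` ending at `end`. Only builds the body; it does not send it.
    """
    start = end - span
    return {
        "queryType": "timeseries",
        "dataSource": data_source,  # placeholder name
        "granularity": "hour",
        "intervals": [f"{start.isoformat()}/{end.isoformat()}"],
        "aggregations": [{"type": "count", "name": "rows"}],
        # Per-query cache flags and a 60s timeout (milliseconds).
        "context": {
            "useCache": use_cache,
            "populateCache": use_cache,
            "timeout": 60000,
        },
    }


end = datetime(2016, 1, 4, tzinfo=timezone.utc)
query = build_timeseries_query("events", end, timedelta(days=1))
body = json.dumps(query, indent=2)  # POST this to the broker's /druid/v2 endpoint
```

Running the same body with `use_cache=False` should show how much of the 20s is coming from uncached segment scans.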

Thanks a lot.