Why do timeBoundary queries hit historical nodes?

Hi,

we have a Druid cluster with quite a few client dashboards pointing at it, and each dashboard sends several timeBoundary queries every minute. Although these queries are lightweight, they swamp our cluster and make the metrics in our dashboard noisier than necessary.

When looking into this, I noticed that the timeBoundary queries not only hit the broker nodes but seem to get forwarded to the historicals.

As I didn’t explicitly exclude them from cache I wonder if they even end up in our caffeine cache, maybe even swamping it.
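For reference, here is a rough sketch of how such a query could be sent with caching switched off via the query context. The `useCache` and `populateCache` context flags and the optional `bound` field are standard parts of native Druid queries; the helper name `time_boundary_query` and the datasource name `dashboard_events` are made up for illustration, and whether timeBoundary results actually land in the cache in a given setup is something to verify rather than assume.

```python
import json


def time_boundary_query(data_source, bound=None, use_cache=False):
    """Build a native Druid timeBoundary query as a plain dict.

    bound may be None (return both), "minTime" or "maxTime".
    use_cache controls both reading from and writing to the cache.
    """
    query = {
        "queryType": "timeBoundary",
        "dataSource": data_source,
        "context": {
            "useCache": use_cache,
            "populateCache": use_cache,
        },
    }
    if bound is not None:
        if bound not in ("minTime", "maxTime"):
            raise ValueError("bound must be 'minTime' or 'maxTime'")
        query["bound"] = bound
    return query


if __name__ == "__main__":
    # "dashboard_events" is a hypothetical datasource name.
    print(json.dumps(time_boundary_query("dashboard_events", bound="maxTime"), indent=2))
```

The resulting JSON would be POSTed to the broker's native query endpoint like any other query.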

So my question is: why do timeBoundary queries get forwarded to the historicals at all? To my understanding, the broker has a timeline view and knows the min/max times itself. Wouldn’t it be possible for the broker to answer timeBoundary queries autonomously?

thanks

This is a very good question. Can you describe what your use case for the timeBoundary query is, and what your expected behavior is?

our use case is that our dashboards send timeBoundary queries to learn about the time range for which Druid holds data. If new data arrives, the dashboard should learn about it.
The expected behaviour… hm … well I would expect the timeBoundary query to return the correct min/max time of course and was just wondering whether this requires that the query is passed to other nodes as it seemed to me that the broker already knows about the oldest and newest segments. That was just curiosity. Maybe there’s a good reason that I’m not seeing. Also, I wondered whether this is the expected behaviour or whether I’m seeing something abnormal.

To give some more background: I was looking into how user queries distribute across historicals. We currently have connectionCount scheduling enabled on the broker in our production system, and during a recent test in our test system I became aware that the comparator used to rank servers based on the connectionCount doesn’t deal with Pareto optimality. (There was a similar issue with segment assignment to servers that got fixed recently.)

In our test environment with three historicals I had configured a load rule that was putting all segments on all historicals. I was sending a sequential stream of queries and saw that only a single historical was receiving all the queries while the other two were idling around.

In production we have one tier that has an in-tier replication factor of 2. In such a case, the effect shouldn’t be as drastic as in tests, but it should still be measurable, so I was looking into the query distribution among the historicals. While doing this, I noticed that most of the queries hitting our historicals are timeBoundary queries.

If a segment has a segment granularity of 24 hours, but a query granularity of 1 minute, and only the first 585 minutes of the segment have any data, what is the expected return value?

13:37 ? :smiley: :smiley:

ah thanks, now I understand what you are hinting at. I wasn’t aware of the need to differentiate between segment intervals and data intervals.

The broker has a timeline view but the times in there are only the segment intervals, not the data intervals. I saw that there’s a class DataSegment and a class Segment. The latter is the only one that has the data timerange in it and seems to be available only to historicals, right? I peeked into the metastore and also didn’t find the data intervals there.

So basically, the expected behaviour is to get the min/max data intervals returned from the timeBoundary query and this information isn’t available to the broker if I understand correctly.
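To make the distinction concrete, here is a small sketch of the example from above: a 24-hour segment where only the first 585 minutes contain rows. It does not use any Druid API; it only assumes that row timestamps are truncated to the 1-minute query granularity, and the date is arbitrary.

```python
from datetime import datetime, timedelta

# Segment interval: what the broker's timeline knows about.
segment_start = datetime(2024, 1, 1)  # arbitrary example date
segment_end = segment_start + timedelta(hours=24)

# Data interval: only the first 585 one-minute buckets contain rows
# (assumption: timestamps are truncated to the minute).
minutes_with_data = 585
row_timestamps = [
    segment_start + timedelta(minutes=m) for m in range(minutes_with_data)
]

data_min = min(row_timestamps)
data_max = max(row_timestamps)

print("segment interval:", segment_start, "to", segment_end)
print("data min/max:    ", data_min, "to", data_max)
```

Under these assumptions the newest row sits at 09:44 on the first day, whereas the segment interval ends at midnight of the following day, so the segment interval alone would overstate the max time by more than 14 hours.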

Interesting. I imagine it might be beneficial to eventually expose more metadata about segment contents to the metastore and broker, like the summary statistics usually available in classical databases.

Now that SQL support is there, my guess is that exposing more summary stats to the broker, as a prerequisite for a query planner, is something that some of you are probably already eyeing?

thanks for your support in guiding me to an understanding.

In theory the broker could have all the info it needs to serve unfiltered timeBoundary queries, but it doesn’t, since it doesn’t get column level summary stats. It might get them one day (like you said, it would help with query planning) and at that time it might be able to serve unfiltered timeBoundary queries on its own too.

Erik is also right, though, that if you have a filter on your timeBoundary the broker would generally still need to forward it to the data nodes.
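As an illustration of the filtered case, a timeBoundary query with a filter might look like the sketch below. The `filter` field and the `selector` filter type are part of the native query language; the helper name, the datasource `dashboard_events` and the dimension `country` are hypothetical. Answering it requires knowing which rows match the filter, which segment-level metadata cannot tell you.

```python
import json


def filtered_time_boundary_query(data_source, dimension, value):
    """Build a timeBoundary query restricted to rows where dimension == value."""
    return {
        "queryType": "timeBoundary",
        "dataSource": data_source,
        "filter": {
            "type": "selector",
            "dimension": dimension,
            "value": value,
        },
    }


if __name__ == "__main__":
    # Hypothetical datasource and dimension names.
    query = filtered_time_boundary_query("dashboard_events", "country", "DE")
    print(json.dumps(query, indent=2))
```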