Historical node performance degrades over time

Over the course of a couple of hours, I’m seeing query times increase dramatically.
Oddly enough, restarting the node brings performance back to where it should be.

Could this be a GC issue of some sort? I’m not sure why this would be happening.

Thanks,

Walt

Without any other information, that is the first thing that comes to mind when reading your description. Do you monitor GC metrics for the node?

What metrics do you think are useful in this context? I have -XX:+PrintGCDetails enabled.
I also see a number of logs for jvm/pool, jvm/gc and jvm/bufferpool.

jvm/gc/time and jvm/gc/count are the most interesting ones.
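
If those aren’t being emitted already, they come from the JVM monitor. A minimal sketch of the relevant runtime.properties for a 0.9.x node (assuming the logging emitter; swap in whatever emitter you actually use) would be:

    druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
    druid.emitter=logging
    druid.emitter.logging.logLevel=info

Plotting jvm/gc/time against query time over those couple of hours should show fairly quickly whether GC is really the culprit.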

Turns out it was more an issue of thread-count than GC. It seems to be working fine now.

Thanks,

Walt

We are facing a similar issue.
When running a test query against a freshly started cluster, I get an average query time of 5.5 seconds. Over the course of a couple of hours the average query time increases to 6.5 seconds. It is a slow and not particularly drastic increase, but it concerns me.

Turns out it was more an issue of thread-count than GC

Can you explain what issue you had and what you did to solve it?

One further question:

I don’t know whether the performance I’m getting is in a normal range or whether it is really, really bad:

The query I issue is a simple topN query with one filter and six metrics, spanning one week of data. The data volume (segments on the historicals) is 50GB per day, and we have 15 r3.8xlarge historicals serving the data: 200GB RAM per node for memory mapping, 31 processing threads on 32 cores. We produce 7 segments per hour, with a segment size of 300MB and 10 million records per segment.
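
For context, some rough back-of-the-envelope numbers from the setup above (assuming the scan spreads evenly across the cluster):

    one week of data:   7 segments/hour × 24 × 7 ≈ 1,176 segments (~350GB at 300MB each)
    processing threads: 15 nodes × 31 threads    = 465
    → each thread scans roughly 2-3 segments (~25 million rows) per query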

Is a query time of 6 seconds sensible for this setup?

thanks

That might be fine or it might be slow; it depends on which metrics and filters you are using. Some things to check for topNs specifically:

  1. Try increasing the size of your processing buffers; if they are too small, performance can suffer.

  2. If you have an expensive metric (HLL/sketch) that you are not sorting on, try setting “doAggregateTopNMetricFirst”: true in your query context (see the sketch after this list). This defers computation of the non-sorting metrics until after the sort and limit. It involves doing another pass, but it can be a net benefit.
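
For point 2, the flag goes in the "context" object at the top level of the topN query, e.g.:

    "context": {
      "doAggregateTopNMetricFirst": true
    }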

Thanks a lot Gian. Getting support from you guys means a lot to me.
The query I was referring to does not have any expensive filters or dimensions. I only have a single filter set (no negations, conjunctions or disjunctions) which picks one customer name out of a pool of several hundred. This filter involves a QTL lookup. So far we have seen no negative performance impact from QTL dimensions. The six metrics are plain doubleSums, but each has a filter on a very low-cardinality dimension. There are also two post-aggregators that compute simple ratios. I attached the query for reference.
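
Each metric looks roughly like the sketch below; the dimension and metric names here are invented for illustration, the real ones are in the attached query.

    {
      "type": "filtered",
      "filter": { "type": "selector", "dimension": "event_type", "value": "click" },
      "aggregator": { "type": "doubleSum", "name": "clicks", "fieldName": "amount" }
    }

And the post-aggregators are plain arithmetic ratios of two such sums:

    {
      "type": "arithmetic",
      "name": "click_rate",
      "fn": "/",
      "fields": [
        { "type": "fieldAccess", "fieldName": "clicks" },
        { "type": "fieldAccess", "fieldName": "impressions" }
      ]
    }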

We currently have the processing buffers on both the broker and the historicals set to 1GB. I saw that some broker configs have 2GB set, and we still need to test with that setting. On historicals I usually saw 500MB.
Do you think 1GB could still be too small?
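
For reference, these are the standard processing knobs I’m referring to; the current values on the historicals (1GB buffers, 31 threads) look like this:

    druid.processing.buffer.sizeBytes=1073741824
    druid.processing.numThreads=31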

I’m wondering whether having 10 million records per segment is too much, but the segment sizes are already on the lower side of what is recommended (211MB per shard, 9-10 shards per hour). If I went with 7 million or 5 million records, the segments would end up being really small and there would be way too many of them relative to the 480 cores that we have.

We are here to help :slight_smile:

1GB should be fine for processing buffers, especially if you aren’t using any expensive metrics. QTL-based filters will impose some overhead, but not overhead that scales with the number of metrics (unless those metrics are all filtered aggregators using a QTL-based filter…).

Also make sure you’re using the most recent version of Druid. Almost every version includes performance improvements.

thanks again Gian, much appreciated.
I forgot to attach the query in the previous post.

I used Druid 0.9.0 for the tests. I did some early tests with Druid 0.9.1rc1 and noticed a roughly 20% performance boost, but I was just playing around, not doing serious testing. I guess the boost is due to the better distribution of segments across historicals.

We need to keep 30 days of data in a hot tier. Currently, for one datasource that has 5 million records per segment/shard, we get very good query latencies for short-term query ranges, but if someone queries 30 days and doesn’t filter on anything, the query is rather slow, 45 seconds or so.

We are now testing with 10 million records per segment/shard, which only yields 4-5 segments per hour. So far we only have a couple of hours of data with this setup, and the short-term query times are twice as long as before. My hope is that this ratio will eventually flip and mean lower query latencies for the segments with 10 million records, but we cannot ingest data for a whole month just to find out whether this hypothesis pans out. So I find it difficult to arrive at a good setup.

I’ve spent months trying to tune Druid, but so far every variation I tried only made Druid slower, not faster. Still waiting for the knot to untie itself and give way to a speed improvement. How likely is it that our very first setup happened to be the optimal one? :wink:
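
For scale, the rough segment math for the unfiltered 30-day case, using the per-hour counts above:

    30 days ≈ 720 hours
    new layout (4-5 segments/hour): ~2,900-3,600 segments per 30-day query
    previous layout (9-10 shards/hour): roughly double that
    processing threads: 15 × 31 = 465
    → roughly 7 vs. ~15 segment scans per thread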

topnquery-six-metrics.txt (4.36 KB)