I created a metricSpec like -
Here I also want to add an average for the counter_value field during ingestion.
How do I calculate the average of a metric column while ingesting?
So average is not a supported ingestion-time function. There are a number of them, in fact, including
The way I work around this (which is by design given that Druid is a parallel ingestion database) is to create the metrics that would allow for those kinds of calculation, which in this instance are a SUM and a COUNT, and thus AVG. So at query time you SUM the sum metric, SUM the count metric, and then do a division.
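As a rough sketch of the approach described here, the snippet below builds the two ingestion-time metrics and the matching query-time division. Only counter_value comes from the question; the metric names row_count and counter_value_sum and the datasource name my_datasource are made up for illustration. It assumes the sum is stored with a doubleSum aggregator; with a longSum you may need a cast at query time to avoid integer division.

```python
import json

# Ingestion-time metrics: keep a row count and a sum, not an average.
# "row_count" and "counter_value_sum" are hypothetical names.
metrics_spec = [
    {"type": "count", "name": "row_count"},
    {
        "type": "doubleSum",
        "name": "counter_value_sum",
        "fieldName": "counter_value",
    },
]

# Query-time average over rolled-up rows: sum of sums / sum of counts.
# "my_datasource" is a placeholder datasource name.
avg_query = (
    "SELECT SUM(counter_value_sum) / SUM(row_count) AS counter_value_avg "
    'FROM "my_datasource"'
)

print(json.dumps(metrics_spec, indent=2))
print(avg_query)
```

Note that the count metric is summed at query time rather than counted again, because each rolled-up row already stands for many original rows.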
Oh and if approximates are OK, there’s also the Quantiles Sketch
This is actually not a workaround. If you want to take the average of a column based on rolled up data, the only way to do it is to keep the sum and count as separate metrics and compute avg = sum/count at query time. Otherwise you would calculate the average of averages, which would be wrong. Let me give you an example:
You have hourly buckets
10-11h: 1000 rows, value=1 => sum=1000, count=1000, avg=1
11-12h: 1 row, value=2 => sum=2, count=1, avg=2
Total correct avg: 1002/1001=1.001
If you take the bucket averages: (1+2)/2=1.5 which is wrong
That's why we don't support averages during ingestion.
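To make the arithmetic in the hourly example concrete, here is a small plain-Python sketch (no Druid involved) that compares the two ways of combining the buckets. The function names are just for illustration.

```python
# (sum, count) per hourly bucket, taken from the example above
buckets = [(1000, 1000), (2, 1)]


def combined_average(buckets):
    """Correct: total sum divided by total count."""
    total_sum = sum(s for s, _ in buckets)
    total_count = sum(c for _, c in buckets)
    return total_sum / total_count


def average_of_averages(buckets):
    """Wrong for rolled-up data: ignores how many rows each bucket holds."""
    return sum(s / c for s, c in buckets) / len(buckets)


print(round(combined_average(buckets), 3))  # 1.001
print(average_of_averages(buckets))         # 1.5
```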
Thank you for the clarity, @Hellmar_Becker !!! I’m afraid my brain wasn’t in clear writing mode…!!!
Do the quantiles sketches support averages? Or only quantiles?
The quantile at a probability/rank of 0.5 is the median, which is a type of average. If your use case allows for an approximate median instead of the mean, then the quantile sketch should work.
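Whether the median is an acceptable stand-in depends on how skewed the data is. The sketch below uses Python's statistics module on made-up values to show how far the two can drift apart; it computes exact values and does not use a quantiles sketch.

```python
import statistics

# Made-up, skewed sample: one large outlier pulls the mean upward.
values = [1, 1, 1, 1, 100]

median = statistics.median(values)  # the 0.5 quantile
mean = statistics.mean(values)      # sum / count

print(median)  # 1
print(mean)    # 20.8
```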