[druid-user] Getting latest values (Druid 25)

Hi all,

I am ingesting telemetry information (Druid 25) which works just fine. Doing a query over a __time range also works fine and gives me what I require.

What I now need is to get the latest value ingested for a particular set of metrics. The set is user selectable based on supplied parameters from one to many metrics.

What I don’t want to do is fire off multiple queries in a loop. I have attempted to use LATEST() but this only works if I have a __time constraint (I don’t know what the constraint needs to be as the metrics could be minutes, hours, days behind, or never).

I have come up with the below CTE that I generate client side. Is this the best/most optimal approach? Will this scale if a need many metric elements?

WITH txAverage as (
SELECT __time, “value”, metric
FROM AxonProdTelegraf
WHERE source = ‘nodeX’ AND resource = ‘XCVR-1/6.4’ AND resource_index = 1 AND
metric_group = ‘groupY’ AND metric = ‘txAverage’
ORDER BY __time DESC
LIMIT 1
),
rxAverage as (
SELECT __time, “value”, metric
FROM “AxonProdTelegraf”
WHERE source = ‘nodeX’ AND resource = ‘XCVR-1/6.4’ AND resource_index = 1 AND
metric_group = ‘groupY’ AND metric = ‘rxAverage’
ORDER BY “__time” DESC
LIMIT 1
)
SELECT * FROM txAverage
UNION ALL
SELECT * FROM rxAverage

Hi richarde,

I’m not sure I understand what was not working.

here is a simple statement that grabs latest attributes within a group:

select channel, max(__time) max_time, Latest(__time,1000) last_time, latest(flags,1000) last_flags
from wikipedia
group by channel

Note the difference in how max() and Lates() work on the __time field.

If this is not what you were looking for, can you post the query that you ideally want and would not work?

Thanks. John

Hi John,

You have set me on the right path! Using this query gave me the result I was after:

SELECT LATEST(“value”, 20), metric
FROM AxonProdTelegraf
WHERE source = ‘nodeX’ AND resource = ‘XCVR-1/6.4’ AND resource_index = 1 AND
metric_group = ‘groupY’ AND metric IN(‘rxAverage’, ‘txAverage’)
GROUP BY source, resource, resource_index, metric_group, metric

Probably what threw me off was not understanding latest() second parameter if the column is a string. Not having ,20 results in a java exception:

Error: Unknown exception

java.util.concurrent.ExecutionException: org.apache.druid.java.util.common.UOE: Cannot make VectorValueSelector for column with class[org.apache.druid.segment.column.StringDictionaryEncodedColumn]

java.lang.RuntimeException

Yes, you have to add that second parameter to give the buffer some working space … I thought there was a default value but maybe not…