Re: [druid-user] Re: Aggregation on realtime updating order data

You can do it using WITH, as Marián suggested. Eg, for your example, I think this could work:

WITH maxOrderTime (orderId, “timestamp”) AS
(
SELECT orderId, max(“timestamp”) FROM order
WHERE “timestamp” < TIMESTAMPADD(DAY, -2, CURRENT_TIMESTAMP)
AND status = “Initiated”
GROUP BY orderId
)
SELECT order.orderId, order.qty, order.paymentId FROM order
JOIN maxOrderTime
ON order.orderId = maxOrderTime.orderId
AND order.“timestamp” = maxOrderTime.“timestamp”

I haven’t tried it, and I’m guessing a bit based on what you wrote. I don’t know how well it will perform.
Also, if you have more than 100k results in the end, you might get a “ran out of resources” (for a subquery) error, iirc.

But, worth a try!

I also note that this might be simpler in a native druid query, if you’re good with those, where you could use aggregators
like stringLast, floatLast, or doubleLast.

Hope that helps!

Will LATEST work for you?

I think you might have an X-Event-style problem? I tried to lay it out in the post below…
https://pmio.substack.com/p/x-events

If so, maybe we should start a dedicated thread in Druid Forum to get some thoughts from others?
https://www.druidforum.org