Druid aggregation when filtering condition is MAX of a group


Given the dataset below, we want to compute the SUM of the per-group MAX of a metric, i.e. SUM(MAX(latencyMs)).

{"time": "2015-09-01T00:00:00Z", "url": "/foo/bar", "user": "alice", "latencyMs": 32}
{"time": "2015-09-01T01:00:00Z", "url": "/", "user": "bob", "latencyMs": 11}
{"time": "2015-09-01T01:30:00Z", "url": "/foo/bar", "user": "bob", "latencyMs": 45}
{"time": "2015-09-02T01:30:00Z", "url": "/foo/bar", "user": "bob", "latencyMs": 80}
{"time": "2015-09-01T01:30:00Z", "url": "/foo/bar", "user": "alice", "latencyMs": 65}

Roughly, we are trying to find the Druid equivalent of the following SQL query:

SELECT SUM(m) FROM
  (SELECT url, user, MAX(latencyMs) AS m FROM my_index
   GROUP BY url, user) q

Is it possible to do something like this in Druid?

If so, any pointers would be of great help.
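
To make the expected result concrete, here is a small Python sketch that computes it over the five sample rows above, outside of Druid. The function name sum_of_group_max is only illustrative; it is not part of any Druid API.

```python
# The five sample rows from the question.
ROWS = [
    {"time": "2015-09-01T00:00:00Z", "url": "/foo/bar", "user": "alice", "latencyMs": 32},
    {"time": "2015-09-01T01:00:00Z", "url": "/", "user": "bob", "latencyMs": 11},
    {"time": "2015-09-01T01:30:00Z", "url": "/foo/bar", "user": "bob", "latencyMs": 45},
    {"time": "2015-09-02T01:30:00Z", "url": "/foo/bar", "user": "bob", "latencyMs": 80},
    {"time": "2015-09-01T01:30:00Z", "url": "/foo/bar", "user": "alice", "latencyMs": 65},
]


def sum_of_group_max(rows):
    """Return the sum, over (url, user) groups, of the maximum latencyMs."""
    group_max = {}
    for row in rows:
        key = (row["url"], row["user"])
        # Keep the largest latency seen so far for this (url, user) pair.
        group_max[key] = max(group_max.get(key, row["latencyMs"]), row["latencyMs"])
    return sum(group_max.values())


# Groups: (/foo/bar, alice) -> 65, (/, bob) -> 11, (/foo/bar, bob) -> 80
print(sum_of_group_max(ROWS))  # 65 + 11 + 80 = 156
```

So for this dataset the query should return 156.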



Hi Sumanth,

Have you looked at PlyQL (https://github.com/implydata/plyql)? It would let you issue the SQL query directly.
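
If a native query is preferred, one option that may fit is a nested groupBy: as far as I know, Druid's groupBy accepts another groupBy as its dataSource (type "query"), so the inner query can take the MAX per (url, user) and the outer one can SUM the result. The Python sketch below only builds and prints such a query body. The interval, the output names "m" and "total", and the helper name build_nested_groupby are assumptions for illustration, and the query has not been run against a cluster.

```python
import json


def build_nested_groupby(datasource="my_index", interval="2015-09-01/2015-09-03"):
    """Build a nested groupBy query body (hypothetical helper, untested sketch)."""
    # Inner query: MAX(latencyMs) per (url, user).
    inner = {
        "queryType": "groupBy",
        "dataSource": datasource,
        "granularity": "all",
        "dimensions": ["url", "user"],
        "aggregations": [
            {"type": "longMax", "name": "m", "fieldName": "latencyMs"}
        ],
        "intervals": [interval],  # assumed interval covering the sample rows
    }
    # Outer query: SUM(m) over the rows produced by the inner query.
    return {
        "queryType": "groupBy",
        "dataSource": {"type": "query", "query": inner},
        "granularity": "all",
        "dimensions": [],
        "aggregations": [
            {"type": "longSum", "name": "total", "fieldName": "m"}
        ],
        "intervals": [interval],
    }


print(json.dumps(build_nested_groupby(), indent=2))
```

The printed JSON could then be POSTed to the broker's query endpoint; please check it against the groupBy documentation for your Druid version before relying on it.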