Get aggregations within a date range outside the interval

Hi guys,

For comparison purposes, is there a way to do this efficiently, instead of executing multiple queries against Druid?

Regards,
Shinesun

Right now, probably not, but it would be nice for https://github.com/implydata/plywood to support this natively.

Hi Fangjin,

Thanks for the reply.

Yeah, that’d be great, as I’m already using Plywood on the client side to communicate with Druid.

However, is something like the following (a nested groupBy query) possible against Druid right now, to get the data from two different intervals and post-aggregate accordingly?

{
  "queryType": "groupBy",
  "dataSource": {
    "type": "query",
    "query": {
      "queryType": "groupBy",
      "dataSource": "nbaStats",
      "granularity": "all",
      "dimensions": ["player"],
      "aggregations": [
        {
          "type": "longSum",
          "name": "assistsPrev",
          "fieldName": "assistsPrev"
        }
      ],
      "intervals": ["2016-07-01T00:00:00.001Z/2016-07-31T00:00:00.001Z"]
    }
  },
  "granularity": "all",
  "dimensions": ["player"],
  "aggregations": [
    {
      "type": "longSum",
      "name": "assists",
      "fieldName": "assists"
    }
  ],
  "intervals": ["2016-08-01T00:00:00.001Z/2016-08-25T00:00:00.001Z"],
  "postAggregations": [
    {
      "type": "arithmetic",
      "fn": "-",
      "fields": [
        {
          "type": "fieldAccess",
          "fieldName": "assists"
        },
        {
          "type": "fieldAccess",
          "fieldName": "assistsPrev"
        }
      ],
      "name": "assistsDelta"
    }
  ]
}

Also, will we be able to issue nested TopN queries as well any time soon?

Regards,
Shinesun

That nested query shouldn’t return any data: the outer query’s intervals don’t intersect the inner query’s intervals at all, so the outer query would filter out everything the inner query returns. One thing you could try instead is having the “intervals” of a single query cover both time ranges you want to scan, then using time-filtered aggregators to compute values for each individual range, plus perhaps post-aggregators to combine their results.

One caveat here is that this is not actually possible in 0.9.1.1. Time-filtered aggregators are possible in master and will be part of 0.9.2.
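
For illustration, here is a rough, untested sketch of what that single query might look like. It assumes a Druid version that has time-filtered aggregators (0.9.2 or later, per the caveat above) and that they are written as a "filtered" aggregator wrapping an "interval" filter on the "__time" column. It also assumes that both periods read the same "assists" metric, which is a guess about the nbaStats schema. The datasource, dimension, output names, and intervals are copied from the query earlier in the thread.

```json
{
  "queryType": "groupBy",
  "dataSource": "nbaStats",
  "granularity": "all",
  "dimensions": ["player"],
  "intervals": [
    "2016-07-01T00:00:00.001Z/2016-07-31T00:00:00.001Z",
    "2016-08-01T00:00:00.001Z/2016-08-25T00:00:00.001Z"
  ],
  "aggregations": [
    {
      "type": "filtered",
      "filter": {
        "type": "interval",
        "dimension": "__time",
        "intervals": ["2016-07-01T00:00:00.001Z/2016-07-31T00:00:00.001Z"]
      },
      "aggregator": {
        "type": "longSum",
        "name": "assistsPrev",
        "fieldName": "assists"
      }
    },
    {
      "type": "filtered",
      "filter": {
        "type": "interval",
        "dimension": "__time",
        "intervals": ["2016-08-01T00:00:00.001Z/2016-08-25T00:00:00.001Z"]
      },
      "aggregator": {
        "type": "longSum",
        "name": "assists",
        "fieldName": "assists"
      }
    }
  ],
  "postAggregations": [
    {
      "type": "arithmetic",
      "name": "assistsDelta",
      "fn": "-",
      "fields": [
        { "type": "fieldAccess", "fieldName": "assists" },
        { "type": "fieldAccess", "fieldName": "assistsPrev" }
      ]
    }
  ]
}
```

Listing both ranges in the top-level "intervals" keeps Druid from scanning the gap between them, while each filtered aggregator only sums rows whose timestamp falls in its own range. The post-aggregator then computes the delta per player in the same pass.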