groupBy bucket per interval

Is it possible to get groupBy aggregations bucketed by the “intervals” field in the query? I.e., I want one aggregation bucket per interval defined in the query.

Thanks!

You could do it by setting a “granularity” on your query.

Gian,
Thanks for your reply. The out-of-the-box granularity options don’t seem to help when the interval sizes are non-uniform. More specifically, I have a dataset that has no data on weekends, and I want the bucketing rules to ignore weekends. For example, I need a timeseries aggregation in 2-day buckets, with one of the buckets spanning a Friday, the weekend, and a Monday, e.g. 2017-09-01 (Friday) to 2017-09-04 (Monday).

I tried the following granularity, with the interval variations shown further below:

    "granularity": {
      "type": "period",
      "period": "P2D",
      "timeZone": "UTC",
      "origin": "2017-09-01T00:00:00"
    }

I tried the three intervals below. All of them result in Druid returning two buckets, with timestamps 2017-09-01T00:00:00.000Z and 2017-09-03T00:00:00.000Z, rather than the single bucket I want.

option 1:

    "intervals": [
      "2017-09-01/2017-09-05"
    ]

option 2:

    "intervals": [
      "2017-09-01/2017-09-02",
      "2017-09-04/2017-09-05"
    ]

option 3:

    "intervals": [
      "2017-09-01T00:00:00.000Z/2017-09-01T23:59:59.999Z",
      "2017-09-04T00:00:00.000Z/2017-09-04T23:59:59.999Z"
    ]

If granularity isn’t going to help me, is there any workaround to get this to work? Any help is appreciated!

I got these variable-interval buckets to work using a JavaScript extraction function on the __time dimension. I haven’t measured the performance penalty of this approach, though.
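
The thread does not include the function itself, so the following is only a rough sketch of what such a bucketing function might look like, not the poster's actual code. It assumes the function receives __time as epoch milliseconds, that all bucketing is done in UTC, and that buckets are two business days wide starting from the 2017-09-01 origin used in the question. The names bucketStart, businessDayIndex, and dayNumberOf are hypothetical.

```javascript
// Illustrative sketch only; all names here are hypothetical.
var DAY_MS = 24 * 60 * 60 * 1000;
var BUCKET_DAYS = 2;                   // bucket width, in business days
var ORIGIN_MS = Date.UTC(2017, 8, 1);  // 2017-09-01 (a Friday), as in the question

// Business days elapsed since Monday 1969-12-29 for the given day.
// dayNumber is whole days since 1970-01-01 (a Thursday). Saturdays and
// Sundays get the same index as the following Monday.
function businessDayIndex(dayNumber) {
  var shifted = dayNumber + 3;         // shifted % 7 === 0 on Mondays
  var weeks = Math.floor(shifted / 7);
  var dow = shifted % 7;               // 0 = Monday ... 6 = Sunday
  return weeks * 5 + Math.min(dow, 5);
}

// Inverse of businessDayIndex for weekdays: index -> days since 1970-01-01.
function dayNumberOf(index) {
  return Math.floor(index / 5) * 7 + (index % 5) - 3;
}

// Maps a timestamp (epoch millis) to the ISO start date of its bucket.
function bucketStart(t) {
  var day = Math.floor(Number(t) / DAY_MS);
  var originIndex = businessDayIndex(Math.floor(ORIGIN_MS / DAY_MS));
  var offset = businessDayIndex(day) - originIndex;
  var bucket = Math.floor(offset / BUCKET_DAYS);
  var startDay = dayNumberOf(originIndex + bucket * BUCKET_DAYS);
  return new Date(startDay * DAY_MS).toISOString();
}
```

With this sketch, bucketStart(Date.UTC(2017, 8, 1)) and bucketStart(Date.UTC(2017, 8, 4)) both return "2017-09-01T00:00:00.000Z", so Friday and the following Monday land in the same bucket. To use something like it in Druid, the function source would presumably be passed as a string to an extraction function of type "javascript" on the __time dimension. The exact query wiring is not shown in this thread, so treat that part as an assumption and check it against the Druid documentation for your version.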