Group by a specific granularity.

Hi all!

I am trying to group by a specific granularity with the time format extraction function but it doesn’t work as expected. Here is my query:

{
“queryType”:“groupBy”,
“dataSource”:{
“type”:“table”,
“name”:“ad_events”
},
“intervals”:{
“type”:“LegacySegmentSpec”,
“intervals”:[
“2015-12-08T00:00:00.000-08:00/2015-12-10T00:00:00.000-08:00”
]
},
“filter”:null,
“granularity”:{
“type”:“all”
},
“dimensions”:[
{
“type”:“extraction”,
“dimension”:"__time",
“outputName”:“hour”,
“extractionFn”:{
“type”:“time”,
“timeFormat”:“yyyy-MM-dd’T’HH:mm:ssZ”,
“resultFormat”:“HH”
}
}
],
“aggregations”:[
{
“type”:“longSum”,
“name”:“sum_clicks”,
“fieldName”:“clicks”
},
{
“type”:“longSum”,
“name”:“sum_views”,
“fieldName”:“views”
},
{
“type”:“doubleSum”,
“name”:“sum_overage_gross_revenue”,
“fieldName”:“overage_gross_revenue”
},
{
“type”:“doubleSum”,
“name”:“sum_overage_publisher_revenue”,
“fieldName”:“overage_publisher_revenue”
},
{
“type”:“longSum”,
“name”:“sum_overage_views”,
“fieldName”:“overage_views”
},
{
“type”:“longSum”,
“name”:“sum_overage_clicks”,
“fieldName”:“overage_clicks”
},
{
“type”:“doubleSum”,
“name”:“sum_publisher_revenue”,
“fieldName”:“publisher_revenue”
},
{
“type”:“doubleSum”,
“name”:“sum_gross_revenue”,
“fieldName”:“gross_revenue”
}
],
“postAggregations”:[

],
“having”:null,
“limitSpec”:{
“type”:“NoopLimitSpec”
},
“context”:null
}

Basically what I want to do is grouping by hour of day. On my query I am querying two days of data, I should get only 24 records back, one for each hour of the day. However I get 48 which lets me think it’s not grouping by hour of day but just bucketing my data by hour. Is it possible to achieve that with Druid? Is my query wrong?

I am using Druid 0.8.1.

Thanks for your help!

Guillaume

Torche, where are you getting your syntax from?
http://druid.io/docs/latest/querying/dimensionspecs.html

Thanks Fangjin,

I guess I was looking to an outdated doc. Here is the final query that works great:

{
“queryType”:“groupBy”,
“dataSource”:{
“type”:“table”,
“name”:“ad_events”
},
“intervals”:{
“type”:“LegacySegmentSpec”,
“intervals”:[
“2016-01-01T00:00:00.000-08:00/2016-01-03T00:00:00.000-08:00”
]
},
“filter”:null,
“granularity”:{
“type”:“all”
},
“dimensions”:[
{
“type”:“extraction”,
“dimension”:"__time",
“outputName”:“hourOfDay”,
“extractionFn”:{
“type”:“timeFormat”,
“format”:“HH”,
“timeZone” : “America/Los_Angeles”
}
}
],
“aggregations”:[
{
“type”:“longSum”,
“name”:“sum_clicks”,
“fieldName”:“clicks”
},
{
“type”:“longSum”,
“name”:“sum_views”,
“fieldName”:“views”
},
{
“type”:“doubleSum”,
“name”:“sum_overage_gross_revenue”,
“fieldName”:“overage_gross_revenue”
},
{
“type”:“doubleSum”,
“name”:“sum_overage_publisher_revenue”,
“fieldName”:“overage_publisher_revenue”
},
{
“type”:“longSum”,
“name”:“sum_overage_views”,
“fieldName”:“overage_views”
},
{
“type”:“longSum”,
“name”:“sum_overage_clicks”,
“fieldName”:“overage_clicks”
},
{
“type”:“doubleSum”,
“name”:“sum_publisher_revenue”,
“fieldName”:“publisher_revenue”
},
{
“type”:“doubleSum”,
“name”:“sum_gross_revenue”,
“fieldName”:“gross_revenue”
}
],
“postAggregations”:[

],
“having”:null,
“limitSpec”:{
“type”:“NoopLimitSpec”
},
“context”:null
}

``