What is the expected behavior for a queryGranularity larger than segmentGranularity?

Hi all,

What is the expected behavior for a queryGranularity larger than segmentGranularity?

I have the following granularitySpec in the Tranquility ingestion spec:

"granularitySpec": {
  "type": "uniform",
  "segmentGranularity": "hour",
  "queryGranularity": "day"
}

I’m expecting segment files to be created by the hour while query results are bucketed by day.
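To make that expectation concrete, here is a minimal Python sketch of what I assume the two settings mean for a single event. It is not Druid code: the helper names and the event time are made up for illustration, and it assumes segmentGranularity picks the hourly segment interval while queryGranularity truncates the stored timestamp to the day.

```python
from datetime import datetime, timezone


def floor_to_hour(ts):
    # Assumed effect of segmentGranularity "hour": start of the segment interval.
    return ts.replace(minute=0, second=0, microsecond=0)


def floor_to_day(ts):
    # Assumed effect of queryGranularity "day": truncated timestamp stored in the segment.
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


# Hypothetical event time, chosen only for illustration.
event = datetime(2017, 2, 21, 16, 42, 7, tzinfo=timezone.utc)

segment_start = floor_to_hour(event)
stored_timestamp = floor_to_day(event)

print(segment_start.isoformat())     # 2017-02-21T16:00:00+00:00
print(stored_timestamp.isoformat())  # 2017-02-21T00:00:00+00:00
```

Under this reading, every event in a day would end up with the same stored timestamp, which is why I expected day-level buckets in query results.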

However, both the segments and the query results appear to be grouped in two-hour steps.


drwxr-xr-x   - some-user supergroup          0 2017-02-22 02:16 some-path/segments/some-datasource/20170221T160000.000Z_20170221T170000.000Z
drwxr-xr-x   - some-user supergroup          0 2017-02-22 04:15 some-path/segments/some-datasource/20170221T180000.000Z_20170221T190000.000Z
drwxr-xr-x   - some-user supergroup          0 2017-02-22 06:15 some-path/segments/some-datasource/20170221T200000.000Z_20170221T210000.000Z
drwxr-xr-x   - some-user supergroup          0 2017-02-22 08:16 some-path/segments/some-datasource/20170221T220000.000Z_20170221T230000.000Z
drwxr-xr-x   - some-user supergroup          0 2017-02-22 10:17 some-path/segments/some-datasource/20170222T000000.000Z_20170222T010000.000Z


curl -X POST -H "Content-Type: application/json" -d '{
  "queryType": "timeseries",
  "dataSource": "some-datasource",
  "intervals": "2017-02-21T02:01Z/2017-02-22T02:01Z",
  "granularity": {
    "type": "period",
    "period": "PT1H",
    "timeZone": "Etc/UTC"
  },
  "context": {
    "timeout": 40000,
    "skipEmptyBuckets": "true"
  },
  "aggregations": [
    {
      "name": "count",
      "type": "doubleSum",
      "fieldName": "count"
    }
  ]
}' "http://localhost:8082/druid/v2/?pretty="

response =>

[
  { "timestamp": "2017-02-21T03:00:00.000Z", "result": { "count": 1048 } },
  { "timestamp": "2017-02-21T05:00:00.000Z", "result": { "count": 279533 } },
  { "timestamp": "2017-02-21T06:00:00.000Z", "result": { "count": 40097873 } },
  { "timestamp": "2017-02-21T07:00:00.000Z", "result": { "count": 22879879 } },
  { "timestamp": "2017-02-21T08:00:00.000Z", "result": { "count": 107820427 } },
  { "timestamp": "2017-02-21T10:00:00.000Z", "result": { "count": 81998681 } },
  { "timestamp": "2017-02-21T12:00:00.000Z", "result": { "count": 90604572 } },
  { "timestamp": "2017-02-21T14:00:00.000Z", "result": { "count": 88780445 } },
  { "timestamp": "2017-02-21T16:00:00.000Z", "result": { "count": 44453800 } },
  { "timestamp": "2017-02-21T18:00:00.000Z", "result": { "count": 19382513 } },
  { "timestamp": "2017-02-21T20:00:00.000Z", "result": { "count": 16479134 } },
  { "timestamp": "2017-02-21T22:00:00.000Z", "result": { "count": 41170166 } },
  { "timestamp": "2017-02-22T00:00:00.000Z", "result": { "count": 90336925 } },
  { "timestamp": "2017-02-22T02:00:00.000Z", "result": { "count": 84515393 } }
]

I’m not sure. That’s definitely a “weird” configuration, and it goes against the normal guideline that queryGranularity should be contained within (that is, no coarser than) segmentGranularity. It looks like your bucketing isn’t entirely every two hours, since there are some odd-numbered hours in there (03:00, 05:00, and 07:00). It’s possible that the queryGranularity is getting “forced” to line up with the segment somewhere and is effectively being truncated to hour.
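As a quick check on the spacing, here is a small Python sketch that diffs the timestamps copied from the response in the question. It only inspects the posted output and says nothing about why Druid produced it; the variable names are just for illustration.

```python
from datetime import datetime

# Timestamps copied from the timeseries response in the question.
timestamps = [
    "2017-02-21T03:00:00.000Z", "2017-02-21T05:00:00.000Z",
    "2017-02-21T06:00:00.000Z", "2017-02-21T07:00:00.000Z",
    "2017-02-21T08:00:00.000Z", "2017-02-21T10:00:00.000Z",
    "2017-02-21T12:00:00.000Z", "2017-02-21T14:00:00.000Z",
    "2017-02-21T16:00:00.000Z", "2017-02-21T18:00:00.000Z",
    "2017-02-21T20:00:00.000Z", "2017-02-21T22:00:00.000Z",
    "2017-02-22T00:00:00.000Z", "2017-02-22T02:00:00.000Z",
]

parsed = [datetime.strptime(t, "%Y-%m-%dT%H:%M:%S.%fZ") for t in timestamps]

# Gap in whole hours between each pair of consecutive result buckets.
gaps_hours = [
    int((later - earlier).total_seconds() // 3600)
    for earlier, later in zip(parsed, parsed[1:])
]

# Buckets that start on an odd-numbered hour.
odd_hours = [p.hour for p in parsed if p.hour % 2 == 1]

print(gaps_hours)  # [2, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]
print(odd_hours)   # [3, 5, 7]
```

So the results are hourly between 05:00 and 08:00 and only settle into two-hour steps from 08:00 onward, which matches the segment directories listed in the question starting at 16:00.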