Some questions about the distinctcount extension

Hi guys,

We are considering using the druid-distinctcount extension to calculate exact distinct counts on a large dataset, but we noticed some limitations in this implementation.

For example, the docs say "make sure queryGranularity is divided exactly by segmentGranularity". Our data's segmentGranularity is day, and the data format is:

day,area_dimension,visitor_id

Suppose we query data across multiple days with a groupBy like this:

{
  "queryType": "groupBy",
  "dataSource": "sample_datasource",
  "dimensions": ["area_dimension"],
  "granularity": "all",
  "aggregations": [
    {
      "type": "distinctCount",
      "name": "uv",
      "fieldName": "visitor_id"
    }
  ],
  "intervals": [
    "2016-03-01T00:00:00/2016-03-10T23:59:59"
  ]
}

(the interval spans multiple days)

Can the druid-distinctcount extension return a correct result for this query? How does it process the segments from different days?

Thanks

No. The result is correct only when the query granularity is exactly the same as the segment granularity; otherwise it is wrong. Your segmentGranularity is day, so the query granularity must be day as well.
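To illustrate why a query with granularity "all" over day segments goes wrong, here is a minimal Python sketch. It assumes the extension counts distinct values inside each segment and then adds the per-segment counts together when merging; that is my understanding of the behavior, not something confirmed in this thread. The visitor IDs and function names are made up for illustration.

```python
# Hypothetical data: visitor_id values in each day segment for one area_dimension value.
segments = {
    "2016-03-01": ["u1", "u2", "u3"],
    "2016-03-02": ["u2", "u3", "u4"],
    "2016-03-03": ["u1", "u4"],
}


def per_segment_counts(segments):
    """Distinct visitors inside each segment (what a per-day query would report)."""
    return {day: len(set(visitors)) for day, visitors in segments.items()}


def merged_by_sum(segments):
    """Assumed merge step: add up the per-segment counts across days."""
    return sum(per_segment_counts(segments).values())


def true_distinct(segments):
    """Exact distinct visitors over the whole interval."""
    return len(set().union(*segments.values()))


if __name__ == "__main__":
    print(per_segment_counts(segments))
    print("sum of per-day counts:", merged_by_sum(segments))
    print("true distinct over all days:", true_distinct(segments))
```

A visitor who shows up on several days is counted once per day, so the summed figure is higher than the true distinct count whenever visitors repeat across segments. Each per-day number on its own is still exact.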
On Monday, September 18, 2017 at 3:35:09 PM UTC+8, GuangSheng Liu wrote: