Hi,
I am new to Druid. I am using the quantilesDoublesSketchToHistogram post-aggregator in a groupBy query.
I think the result it returns is incorrect. Could someone shed some light on this?
MY QUERY :
{
  "queryType": "groupBy",
  "dataSource": "sample1",
  "granularity": "hour",
  "dimensions": [
    {"type": "default", "dimension": "appid", "outputName": "application_id"}
  ],
  "aggregations": [
    {
      "type": "quantilesDoublesSketch",
      "name": "count_appid",
      "fieldName": "appid"
    }
  ],
  "postAggregations": [
    {
      "type": "quantilesDoublesSketchToHistogram",
      "name": "histogram_count_appid",
      "field": {"type": "fieldAccess", "fieldName": "count_appid"},
      "splitPoints": [1000.0, 2000.0]
    }
  ],
  "intervals": ["2020-02-07T00:00:00.000Z/2020-02-07T23:59:00.000Z"]
}
MY RESULT :
timestamp : 2020-02-07T06:00:00.000Z
histogram_count_appid : 400
count_appid : 4
application_id : 1044534198
timestamp : 2020-02-07T06:00:00.000Z
histogram_count_appid : 200
count_appid : 2
application_id : 1057889290
and so on…
I referred to the following discussion:
URL : https://github.com/apache/druid/issues/6853
"Also i calculated the quantiles of [0.50, 0.75, 0.90, 0.95] and the histograms of [ 0.0, 200.0, 400.0, 600.0, 800.0, 1000.0, 1200.0, 1400.0, "Infinity" ] by myself. They were [100, 1150, 1772, 1886] and [ 6.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 3.0 ].
Compared the actual result with the query result, i found the quantile query of approximate histogram was more accurate than quantiles sketch, but for the histogram query, quantiles sketch was win.
Can you tell me more about why the quantile query of approximate histogram was more accurate than quantiles sketch?"
My results differ from those in the discussion above. Is there something I am missing to get the correct answer?
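For comparison, the exact histogram that a set of split points defines can be computed by hand, as the quoted poster did. The snippet below is only a plain Python sketch of that manual check, not Druid code. It assumes that n split points give n + 1 bins and that each bin includes its left split point and excludes its right one, which is my reading of the DataSketches quantiles documentation. The function name exact_histogram and the sample values are made up for illustration.

```python
from bisect import bisect_right


def exact_histogram(values, split_points):
    """Count how many values fall into each bin defined by the split points.

    n split points produce n + 1 bins: (-inf, s0), [s0, s1), ..., [s_last, +inf).
    """
    splits = sorted(split_points)
    counts = [0] * (len(splits) + 1)
    for v in values:
        # bisect_right places a value equal to a split point in the bin
        # to the right of that point, so bins are left-inclusive.
        counts[bisect_right(splits, v)] += 1
    return counts


# Made-up sample values, using the same split points as my query.
sample = [10.0, 999.9, 1000.0, 1500.0, 2000.0, 2500.0]
print(exact_histogram(sample, [1000.0, 2000.0]))  # prints [2, 2, 2]
```

Under these assumptions, any value of 2000.0 or more lands in the last bin, so a column whose values are all far above the largest split point would put every count in that final bin.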