Getting a single row in sql output instead of multiple

I have the following query on hive as well as Druid. Hive gives me the complete resultset, but Druid is only giving me 1 row.

Should the query be written in a different way?

SELECT
CASE
WHEN
SUM(CAST(hours AS BIGINT)) < APPROX_QUANTILE(hours_histogram, 0.33, 10000) FILTER(WHERE hours > 0)
THEN ‘LT’ END
FROM “test”

``

If you prepend "EXPLAIN PLAN FOR " to the query, what does it plan into?

Thanks,

Jon

Thanks for your quick reply. This is what i see in the explain…

DruidQueryRel(query = [{

“queryType”: “timeseries”,

“dataSource”: {

“type”: “table”,

“name”: “test”

},

“intervals”: {

“type”: “intervals”,

“intervals”: ["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]

},

“descending”: false,

“virtualColumns”: ,

“filter”: null,

“granularity”: {

“type”: “all”

},

“aggregations”: [{

“type”: “longSum”,

“name”: “a0”,

“fieldName”: null,

“expression”: “CAST(“hours”, ‘LONG’)”

}, {

“type”: “filtered”,

“aggregator”: {

“type”: “approxHistogramFold”,

“name”: “a1:agg”,

“fieldName”: “hours_histogram”,

“resolution”: 10000,

“numBuckets”: 7,

“lowerLimit”: “-Infinity”,

“upperLimit”: “Infinity”

},

“filter”: {

“type”: “bound”,

“dimension”: “hours”,

“lower”: “0”,

“upper”: null,

“lowerStrict”: true,

“upperStrict”: false,

“extractionFn”: null,

“ordering”: {

“type”: “numeric”

}

},

“name”: “a1:agg”

}],

“postAggregations”: [{

“type”: “quantile”,

“name”: “a1”,

“fieldName”: “a1:agg”,

“probability”: 0.33

}, {

“type”: “expression”,

“name”: “p0”,

“expression”: “case_searched((“a0” < “a1”),‘LT’,’’)”,

“ordering”: null

}],

“limit”: 2147483647,

“context”: {

“skipEmptyBuckets”: true,

“timeout”: 300000

}

}], signature = [{

p0: STRING

}])

``

I think you’re only getting one result because of the granularity:

 "granularity": {
    "type": "all"
  },

“all” will treat all data as belonging to a single time bucket.

If you add a GROUP BY (FLOOR _time to ) clause, I think that would give more results.

Thanks,

Jon

Thanks for the tip Jonathan. I will try that out.

I think Druid is doing the right thing here, FWIW. You have an aggregate function in your query and no GROUP BY, which means you should get a single ‘grand total’ result row back. I’m not sure why Hive would do something differently.

I think Druid is not doing the right thing here. Druid is trying to aggregate whereas i asked it to return me all the rows.
Maybe Druid is built to aggregate only and I should not be trying to get raw list out of Druid?

Your SELECT is composed only of aggregate functions (SUM and APPROX_QUANTILE) and you have no GROUP BY. That means you’re asking for an aggregate. What are you trying to do exactly? Maybe we can help you rewrite your query.

Gian