Alternate query aggregation granularity vs filter(having) granularity

I have the following query in PostgreSQL:

SELECT SUM(revenue) as revenue, AVG(revenue) as revenue_avg,

TO_CHAR(a.report_date, ‘IYYY-IW’) as weekdate

FROM table a

INNER JOIN (

SELECT site_id, product_id

FROM table

WHERE site_id = 19

GROUP BY product_id, site_id

HAVING revenue > 200

) b ON a.site_id = b.site_id AND a.product_id = b.product_id

GROUP BY

TO_CHAR(a.report_date, ‘IYYY-IW’)

Is it possible to get a similar result in Druid? These are for our ‘trended’ queries, where a user might filter on a calculated column by product_id and we return a table of that data, plus a total of that data aggregated by week that is graph-able.

I don’t think it works in a nested subquery since report_date isn’t part of the base query’s dimensions since the base query would have the equivalent of ‘all’ granularity in Druid, but the top level query has a weekly granularity.

Hi, I plan to support this kind of query as part of PlyQL in the near future.

Here is the relevant issue if you want to follow it: https://github.com/implydata/plyql/issues/25

Otherwise you have to make two queries for now.

Thanks, Vadim. I was hoping this could be accomplished using Druid’s JSON syntax since out app is in PHP, however, I haven’t yet built out the components for it, so I could rethink how I’m approaching it. The MySQL gateway could be helpful there.

The only way I can see to do this now would be to run the inner query to get all the dimension combinations, then use that to build out a huge list of filters to apply to the outer query.

I don’t think this will be an effective approach as there can be up to 300000 products, and even more results if more dimensions are applied and we would need to fetch all of them to apply sorting, filtering and paging to the outer query.

Yeah my method would also be to fully materialize the intermediate list.