I'm new to Druid and was wondering whether its rollup functionality handles certain obvious and less obvious scenarios.
Would the following functions work as expected? MAX, MIN, AVG, COUNT, COUNT(DISTINCT), PERCENTILE, SUM
I assume PERCENTILE would need all the raw values.
The best place to start learning Druid would be the tutorial: http://druid.io/docs/latest/tutorials/index.html
And some theoretical background: http://druid.io/docs/latest/design/index.html
As for aggregate functions, these can be found here: http://druid.io/docs/latest/querying/aggregations.html
These aggregations are used in the main groupBy query type, as well as in other queries (timeseries and topN, which are essentially optimized forms of groupBy with certain limitations).
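To make the rollup question concrete, here is a minimal sketch in plain Python (not Druid's API) of how rows can be collapsed per hour and dimension while SUM, MIN, MAX and COUNT stay exact. The event data, the field names, and the `roll_up` and `query_avg` helpers are all made up for illustration. AVG is shown as derived from the stored sum and count at query time, which is my understanding of how it is usually handled on rolled-up data.

```python
# Hypothetical raw events: (ISO timestamp, page, latency_ms)
raw_events = [
    ("2024-01-01T10:05:00", "home", 120),
    ("2024-01-01T10:20:00", "home", 80),
    ("2024-01-01T10:40:00", "cart", 200),
    ("2024-01-01T11:10:00", "home", 100),
]

def roll_up(events):
    """Collapse events to one row per (hour, page), keeping mergeable aggregates."""
    rolled = {}
    for ts, page, latency in events:
        key = (ts[:13], page)  # truncate the timestamp to the hour
        row = rolled.setdefault(
            key, {"count": 0, "sum": 0, "min": latency, "max": latency}
        )
        row["count"] += 1
        row["sum"] += latency
        row["min"] = min(row["min"], latency)
        row["max"] = max(row["max"], latency)
    return rolled

def query_avg(rolled, page):
    """Derive AVG at query time from the stored sum and count."""
    rows = [r for (_hour, p), r in rolled.items() if p == page]
    total = sum(r["sum"] for r in rows)
    count = sum(r["count"] for r in rows)
    return total / count

rolled = roll_up(raw_events)
print(len(rolled))                # 3 rolled-up rows instead of 4 raw events
print(query_avg(rolled, "home"))  # 100.0, same as averaging the raw values
```

Note that after rollup the number of original events is the sum of the stored `count` column, not the number of stored rows.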
More on queries: http://druid.io/docs/latest/querying/querying.html
Regarding approximate functions (COUNT(DISTINCT) and, in some sense, PERCENTILE), I'd look into DataSketches. It is a well-known library for approximate results, and there is a core extension that integrates it into Druid: http://druid.io/docs/latest/development/extensions-core/datasketches-extension.html
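As a rough illustration of why these two are harder than SUM or MAX, the sketch below (plain Python, made-up data) shows that per-row distinct counts and per-row medians cannot simply be combined after rollup, while a mergeable structure can. An exact Python `set` stands in for a sketch here purely for illustration; a real sketch keeps a small, bounded summary and gives an approximate answer instead.

```python
import statistics

# Hypothetical raw values that ended up in two separate rolled-up rows
rows = [
    {"users": ["u1", "u2", "u3"], "latencies": [10, 20, 30]},
    {"users": ["u2", "u3", "u4"], "latencies": [40, 50, 1000, 2000, 3000]},
]

def naive_distinct(rows):
    """Add up per-row distinct counts; users seen in several rows are double counted."""
    return sum(len(set(r["users"])) for r in rows)

def merged_distinct(rows):
    """Merge a per-row set (stand-in for a sketch), then count once."""
    merged = set()
    for r in rows:
        merged |= set(r["users"])
    return len(merged)

# Combining per-row medians does not give the median of all values
median_of_medians = statistics.median(
    statistics.median(r["latencies"]) for r in rows
)
true_median = statistics.median(
    v for r in rows for v in r["latencies"]
)

print(naive_distinct(rows), merged_distinct(rows))  # 6 4
print(median_of_medians, true_median)               # 510.0 45.0
```

So the assumption in the question is right for exact results: without the raw values (or a mergeable summary stored at ingestion time), distinct counts and percentiles cannot be recovered from rolled-up rows.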