I am building an object that decides which query type to use depending on the parameters passed into it.
My logic so far is:
If we want multiple dimensions or a having clause, use GroupBy
If we want a single dimension (no having clause), use TopN
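
To make the rule concrete, here is a minimal Python sketch of the selection logic as I have it now. The function name `choose_query_type` and its parameters are hypothetical, made up for illustration. It only returns the query type string and does not build the full query.

```python
from typing import Optional, Sequence


def choose_query_type(dimensions: Sequence[str], having: Optional[dict] = None) -> str:
    """Return the query type to use (hypothetical helper, not part of any library).

    - multiple dimensions or a having clause -> "groupBy"
    - a single dimension and no having clause -> "topN"
    """
    if not dimensions:
        # This sketch assumes at least one dimension is always requested.
        raise ValueError("at least one dimension is required")
    if len(dimensions) > 1 or having is not None:
        return "groupBy"
    return "topN"
```

For example, `choose_query_type(["country"])` picks TopN, while adding a second dimension or any having clause switches to GroupBy.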
However, after reading about TopN’s inaccuracies on dimensions with high cardinality, I am wondering whether I should use it at all.
Hence, I have some questions:
Assuming I want 100% accuracy for a single dimension, should I use GroupBy, or TopN with a very high threshold?
For multiple dimensions, would it be more efficient to use one GroupBy or several TopNs (as is done in Pivot)?
Thanks in advance!