We have a dimension with about 300,000 unique values, and a count metric that counts the number of events ingested per value.
When we run a TopN on this dimension and metric with a threshold of 100,000, some values that should be in the top 100 (according to a groupBy we ran as a check) are not returned at all by the TopN, which makes the result very inaccurate.
The documentation says: "In practice, this means that if you ask for the top 1000 items ordered, the correctness of the first ~900 items will be 100%, and the ordering of the results after that is not guaranteed."
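To make the question concrete, here is a toy Python model of how a TopN that merges per-segment partial results could drop a value that an exact aggregation keeps. This is only a sketch under assumptions: the function names, the `segments` data, and the merge strategy are all hypothetical, and it is not the engine's actual implementation. It only shows the general kind of approximation the quoted passage seems to describe.

```python
from collections import Counter

def approximate_topn(segments, threshold):
    """Toy model (assumption): each segment reports only its local top
    `threshold` values, and the partial results are then merged."""
    merged = Counter()
    for segment in segments:
        # A segment only contributes its own top `threshold` entries.
        for value, count in Counter(segment).most_common(threshold):
            merged[value] += count
    return merged.most_common(threshold)

def exact_topn(segments, threshold):
    """Exact result, comparable to aggregating everything first (groupBy-style)."""
    total = Counter()
    for segment in segments:
        total.update(segment)
    return total.most_common(threshold)

# Hypothetical data: "x" is never in a segment's local top 2,
# but has the highest total count across both segments.
segments = [
    {"a": 10, "b": 9, "x": 8},
    {"c": 12, "d": 11, "x": 8},
]

print(exact_topn(segments, 2))        # includes "x"
print(approximate_topn(segments, 2))  # "x" is missing
```

In this model a value is lost only when it falls below the per-segment cutoff in every segment, which seems unlikely with a threshold of 100,000 on 300,000 values, hence the question.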
Could this be a bug?
Or could this have to do with rollup and all the other dimensions we have (about 25, with a total rollup to 20% of the original event count)?