Multi-value dimension: bug or feature?

We have a datasource with one multi-value dimension named “Advertiser Domain”, which contains lists of domain names.
When I group on this dimension and also filter on one domain, I get back a list of all domains that co-occurred with the one selected via the filter.
(See screenshot below)

So far so good, but when I look at the measures, I can see that the totals are less than individual entries, although there are no negative values that could cause this behaviour (I reversed the sort order to confirm the absence of any negative values).
In the screenshot I attached, the domain “verizon.com” has a “Served” count of 78.8k, but the total “Served” count is only 80.43k.

I do understand that the individual measures overlap, but how can an individual number for a single entry in a multi-value dimension be higher than the total sum?
To make sure that this behaviour is not caused by a bug in Pivot, I set it to debug mode and looked at the native Druid queries Pivot generates. I can see that both the timeseries query submitted for the totals and the topN query submitted for the individual breakdowns contain the exact same filter expression.

Is this a bug or am I misunderstanding how multi-value dimensions work?

thanks
Sascha

Hey Sascha,

Is it possible that one event can have “verizon.com” listed in the “advertiser domain” field multiple times? If you aren’t de-duping those then I believe they would get counted multiple times.

Thanks for the hint.

I’ll check that although I currently don’t understand how this would explain the numbers in the screenshot: the measures for the verizon.com entry could very well be overcounted, but then the total value would be overcounted also, right?

It depends on what the queries are doing exactly – whether they are counting values or rows. It’s possible that the split-out query is counting values and the totals query is counting rows.

Actually I should say aggregating on, not counting.
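
To make the rows-versus-values distinction concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not Druid's actual implementation and not the queries Pivot generated: the rows, the `domains` and `served` field names, and the helper functions are all hypothetical. It assumes the totals query aggregates each matching row once, while the split aggregates a row once per value in the multi-value dimension, duplicates included.

```python
from collections import defaultdict

# Hypothetical rows: a multi-value "domains" list plus a "served" metric.
# The first row lists "verizon.com" twice (the duplicate case discussed above).
rows = [
    {"domains": ["verizon.com", "verizon.com", "att.com"], "served": 50},
    {"domains": ["verizon.com", "example.com"], "served": 30},
    {"domains": ["example.com"], "served": 20},
]


def matches(row, domain):
    """Filter on a multi-value dimension: a row matches if any value matches."""
    return domain in row["domains"]


def total_served(rows, domain):
    """Totals (timeseries-like): every matching row contributes exactly once."""
    return sum(r["served"] for r in rows if matches(r, domain))


def served_by_value(rows, domain, dedupe=False):
    """Split (topN-like): every matching row contributes once per value."""
    out = defaultdict(int)
    for r in rows:
        if not matches(r, domain):
            continue
        # With dedupe=True, repeated values within one row are collapsed first.
        values = set(r["domains"]) if dedupe else r["domains"]
        for v in values:
            out[v] += r["served"]
    return dict(out)


total = total_served(rows, "verizon.com")
split = served_by_value(rows, "verizon.com")
deduped = served_by_value(rows, "verizon.com", dedupe=True)

print(total)                    # 80  (50 + 30, one contribution per row)
print(split["verizon.com"])     # 130 (50 + 50 + 30, duplicate counted twice)
print(deduped["verizon.com"])   # 80  (matches the total once de-duplicated)
```

In this toy data the split entry for “verizon.com” exceeds the total only because of the duplicate value in the first row. Once values are de-duplicated per row, the filtered domain's entry equals the total, which is what one would expect if every matching row contains that domain exactly once.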