HLL query result changes for the same query

Hi,

We have queries that use an HLL metric, and when we run the same query multiple times the result is not the same.

hll column definition:

`

{
  "type": "hyperUnique",
  "name": "unique_users",
  "fieldName": "user_id",
  "isInputHyperUnique": false
}

`

sql query:

`

SELECT COUNT(DISTINCT unique_users) AS unique_users
FROM druid.users
WHERE __time >= TIME_PARSE('2020-06-08 00:00:00.000', 'yyyy-MM-dd HH:mm:ss.SSS')
  AND __time < TIME_PARSE('2020-07-05 00:00:00.000', 'yyyy-MM-dd HH:mm:ss.SSS')
LIMIT 500

`

result 1

`

[
{
"unique_users": 46648761
}
]

`

result 2, 1 second later

`

[
{
"unique_users": 46692684
}
]

`

I know that HLL is an approximation, but is it normal for the result to change between two executions of the same query?

It won’t necessarily return the same result every run, because the algorithm is order-dependent, and data isn’t fed into aggregators in a deterministic order. (It depends on the order that segments happen to get read in.) If all of the results are within an acceptable error bound then you shouldn’t worry about it.
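To make "within an acceptable error bound" concrete, here is a minimal Python sketch that computes the relative spread between the two results posted above. The function name and the 2% tolerance are illustrative assumptions, not values taken from Druid; substitute whatever tolerance matches your own accuracy requirements.

```python
def relative_spread(estimates):
    """Return (max - min) / min for a list of positive estimates."""
    lowest = min(estimates)
    highest = max(estimates)
    return (highest - lowest) / lowest


# Hypothetical tolerance chosen for illustration only; not a Druid setting.
ASSUMED_TOLERANCE = 0.02

# The two results reported in this thread for the same query.
results = [46648761, 46692684]

spread = relative_spread(results)
print(f"relative spread: {spread:.4%}")
print("within assumed tolerance:", spread <= ASSUMED_TOLERANCE)
```

For the two results above, the difference is 43,923 out of roughly 46.6 million, which is about 0.09%.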

Thanks, we will work with that and explain this behavior.