How to store and filter a high-cardinality dimension?

Hello,

I have a high-cardinality dimension (IP addresses). In my case almost every IP will be unique. Now I would like to filter on IP ranges (for example 10.0.0.1/24).

The most basic solution would be to store it as a normal dimension, but from what I understand this is not optimal for Druid. Can I do something smart, for example using a ThetaSketch? It would be OK for me if the filtering is not exact.
(I already have a ThetaSketch to count the number of unique IPs.)

Thanks!
Victor

A Theta sketch might work, but have you thought about storing the range itself as dimensions, such as “ip_base” / “net_mask”, and then using a business logic layer to express the desired ranges?
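
To make the suggestion concrete, here is a rough sketch of how the two dimensions could be derived per row before ingestion. It assumes IPv4 and a fixed /24 prefix; the function name `range_dimensions` and the default prefix length are made up for illustration, and only the dimension names come from the suggestion above.

```python
import ipaddress


def range_dimensions(ip: str, prefix_len: int = 24) -> dict:
    """Derive hypothetical "ip_base" / "net_mask" dimension values for one IP.

    prefix_len is an assumption: it fixes the range granularity at ingestion time.
    """
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return {
        "ip_base": str(network.network_address),
        "net_mask": str(network.netmask),
    }


if __name__ == "__main__":
    print(range_dimensions("10.0.0.17"))
```

With this layout, all addresses in the same /24 share one "ip_base" value, so the dimension's cardinality drops, but only ranges aligned to the chosen prefix can be filtered directly.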

Sorry, maybe I wasn't clear in the first message. The filtering needs to work dynamically, so that I can pick any range to filter on when I run my Druid query.

/Victor

I don’t think there’s native support for it yet, but such a filter on an integer dimension would be a nice reference implementation for numeric dimension handling. Would you mind filing a GitHub issue describing what kind of query you want to do and how you intend to use the results?
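
For the issue, it may help to spell out the integer mapping. The sketch below assumes IPv4 addresses stored as their 32-bit integer value, so that any CIDR range becomes a pair of inclusive numeric bounds. The helper names and the dimension name `ip_long` are hypothetical. The dictionary is shaped like a Druid bound filter with numeric ordering; whether your Druid version compares the bounds numerically is an assumption to verify.

```python
import ipaddress


def ip_to_long(ip: str) -> int:
    """Integer form of an IPv4 address, as it might be stored at ingestion."""
    return int(ipaddress.IPv4Address(ip))


def cidr_to_bounds(cidr: str) -> tuple:
    """Inclusive (lower, upper) integer bounds covered by a CIDR range."""
    # strict=False accepts host-style input such as 10.0.0.1/24.
    network = ipaddress.ip_network(cidr, strict=False)
    return int(network.network_address), int(network.broadcast_address)


def bound_filter(dimension: str, cidr: str) -> dict:
    """Build a filter dict for the range (assumed shape, check your Druid version)."""
    lower, upper = cidr_to_bounds(cidr)
    return {
        "type": "bound",
        "dimension": dimension,  # hypothetical dimension holding ip_to_long values
        "lower": str(lower),
        "upper": str(upper),
        "ordering": "numeric",   # assumption: numeric comparison is available
    }


if __name__ == "__main__":
    print(bound_filter("ip_long", "10.0.0.1/24"))
```

Because the range is computed at query time, any prefix length can be chosen dynamically, which is the requirement described above.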