Question on performance of RoaringBitmap vs. Concise

Our Druid cluster, which is supposed to serve ad-hoc queries over time ranges of 1 to 3 months of data, isn’t fast enough.
I was wondering whether using RoaringBitmaps instead of Concise would reduce query latencies.
I couldn’t find much info on performance improvements.
I compared performance on one hour of data with hour query granularity but did not see a difference.

Is there any document comparing the two bitmaps in the context of Druid? I can find documentation outside of Druid that suggests that Roaring is faster/better but I’d need to know how things are inside Druid. What would be the pros and cons of using Roaring?

If Roaring encoding led to a moderate increase in data volume, that would be acceptable for us, as our main focus is query latency.

thanks
Sascha

Roaring is much faster for boolean operations (i.e. resolving filters). This means that if you have intricate predicates in your filter or are using regex filters, Roaring tends to pull way ahead of Concise.

If you don’t ever use filters or only have a single statement in your filter, you might not see any improvement. This is partly because for very small and very simple filter statements, your aggregation compute time is going to outweigh the time it takes to resolve the filter.
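To make "boolean operations" concrete, here is a minimal sketch of how a compound filter resolves against per-value bitmaps. It is not Roaring or Concise: plain Python integers stand in as uncompressed bitmaps, and the dimension names, values, and row ids are made up for illustration.

```python
# Each (dimension, value) pair maps to a bitmap whose set bits mark matching rows.
# Python ints are used as uncompressed bitmaps; this is a stand-in, not Roaring or Concise.

def bitmap(rows):
    """Build a bitmap (int) with one bit set per row id."""
    bits = 0
    for r in rows:
        bits |= 1 << r
    return bits

def rows_of(bits):
    """List the row ids whose bits are set, in ascending order."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

# Hypothetical index over 8 rows (ids 0..7).
index = {
    ("country", "US"): bitmap([0, 1, 4, 5, 7]),
    ("device", "mobile"): bitmap([1, 2, 5]),
    ("device", "tablet"): bitmap([3, 4]),
}

# Filter: country = US AND (device = mobile OR device = tablet)
device_match = index[("device", "mobile")] | index[("device", "tablet")]
matching = index[("country", "US")] & device_match
matching_rows = rows_of(matching)
print(matching_rows)
```

The more AND/OR terms a filter has (a regex filter can expand into an OR over every matching value), the more of these bitmap operations run per segment, which is where the speed of the bitmap implementation starts to matter.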

cool! thanks so much. This is exactly the kind of info I need.
Is there any downside to switching to Roaring performance-wise? Is it slower than Concise for other things? Or is it not mature enough for a production setup?

If not, I’d just switch over to using roaring.

thanks

Sascha

Roaring has seen a lot of improvements in the last year. There are some cases where the on-disk size is larger for Roaring than for Concise, so if you have nodes that are tightly bound by disk size or disk I/O, this can make a difference.

In the real world, “mature enough for a production setup” is a relative term. Internally, we simply have not had the engineering resources to run Roaring vs. Concise through the full gauntlet of production release validation. Or, more precisely, there have been other engineering tasks that are believed to have a greater impact and are therefore pursued instead.

If I were starting out a new cluster and evaluating various configurations anyway, I would definitely check it out.

Thanks a lot, Charles, for sharing these insights. Much appreciated.

“through the full gauntlet of production release validation”
LOL, nicely put.

“there have been other engineering tasks that are believed to have a greater impact and are therefore pursued instead.”
Like no-bid sampling, I assume? :wink:

We recently switched over to Roaring bitmaps. We have indexes with some very high cardinality dimensions, and perform complex queries with a few levels of AND/OR filters.

When using Concise bitmaps, we observed that queries to the index service would slow down dramatically as the index filled up. This gave our latency a distasteful sawtooth shape that tracked our segment granularity. After switching to Roaring bitmaps, this behavior went away entirely and query latencies improved across the board. We do see that segments are somewhat larger, but this is a good tradeoff for us.

As an aside, this wasn’t that easy to change, since Tranquility doesn’t respect this configuration from the JSON spec. We had to call into the Scala beam builder from Java, which required some fairly gross interop code. This seems like an oversight on the Tranquility side.

Hey Max,

Great feedback on your setup.

Which version of Tranquility were you using? It’s supposed to respect all the tuningConfig settings. Some older versions didn’t do that, but newer ones should. If it doesn’t then that’s a bug.

We’re using Tranquility 0.7.4. When I added an indexSpec to the tuningConfig it didn’t work for me, but perhaps I did something wrong. We just modified the beam by calling into the builder, though this was somewhat tricky from the Java side.

With 0.7.4+ it should work to include:

"tuningConfig" : {
  "indexSpec" : { "bitmap" : { "type" : "roaring" } }
}
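If the setting seems to be ignored, one thing worth ruling out is that the fragment got mangled on the way into the spec file. Forum software often turns straight quotes into curly ones, which is no longer valid JSON. A minimal sketch of that check, assuming only the fragment quoted in this thread (wrapped in braces so it parses on its own; the rest of the spec is omitted):

```python
import json

# The fragment from this thread, wrapped in braces so it is a standalone JSON object.
fragment = '{ "tuningConfig" : { "indexSpec" : { "bitmap" : { "type" : "roaring" } } } }'

spec = json.loads(fragment)
bitmap_type = spec["tuningConfig"]["indexSpec"]["bitmap"]["type"]
print(bitmap_type)

# The same text with curly quotes, as it might look when pasted from a web page.
curly = fragment.replace('"', "\u201c")
try:
    json.loads(curly)
    curly_ok = True
except json.JSONDecodeError:
    curly_ok = False
print(curly_ok)
```

This only confirms the text is well-formed JSON; it says nothing about whether a given Tranquility version actually honors the setting.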

If you end up trying that and it doesn’t work, could you please file an issue here: https://github.com/druid-io/tranquility/issues/new

Thanks!