Some issues with merging segments

We’ve had an interesting experience with merging segments.

We have about three weeks of data indexed via Tranquility at SIX_HOUR segment granularity. We wanted to try rolling those segments up into DAY granularity with merge tasks.

Because we use Tranquility, we can’t use the coordinator’s automatic merging feature: Tranquility always writes segments with a linear shard spec. But since we only have a single shard in this datasource, we tried rewriting the shard spec to none and manually submitting merge tasks to the overlord.
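
For reference, the tasks we submitted looked roughly like the sketch below. The datasource name, aggregator, interval, and loadSpec are placeholders rather than our real values, and there is one descriptor per SIX_HOUR segment in the day (only one shown here):

    {
      "type": "merge",
      "dataSource": "example_datasource",
      "aggregations": [
        { "type": "longSum", "name": "count", "fieldName": "count" }
      ],
      "segments": [
        {
          "dataSource": "example_datasource",
          "interval": "2015-11-01T00:00:00.000Z/2015-11-01T06:00:00.000Z",
          "version": "2015-11-01T00:00:00.000Z",
          "loadSpec": {
            "type": "s3_zip",
            "bucket": "example-deep-storage",
            "key": "example_datasource/.../index.zip"
          },
          "dimensions": "dim1,dim2",
          "metrics": "count",
          "shardSpec": { "type": "none" },
          "binaryVersion": 9,
          "size": 150000000,
          "identifier": "example_datasource_2015-11-01T00:00:00.000Z_2015-11-01T06:00:00.000Z_2015-11-01T00:00:00.000Z"
        }
      ]
    }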

The tasks do run: they complete in about a minute and publish a new segment containing a full day’s data, so we went ahead and merged all three weeks into day segments. The original segments were around 100-200 MB each and the merged segments around 500-600 MB.

The problem we observed is that query latency increased alarmingly: historical nodes took 10-20x longer to return, and CPU usage and load average went through the roof. We did not see any errors at all on the historicals, only very high load. When we disabled the new merged segments, the behavior returned to normal.

Has anyone seen something like this, or have a theory as to why this would happen?

Hey Max,

I wonder if one of these things is happening:

  • Maybe post-merge you have few enough segments that you aren’t getting good parallelism on your historicals anymore (a single segment is always scanned by a single processing thread, so to fully parallelize a query, you need more segments than you have total processing threads).

  • Maybe the merged segments are different in some way from the un-merged segments; perhaps different aggregations, perhaps different compression or indexes, or perhaps different data altogether (do queries on the merged segments return the same results as pre-merge segments?).

If you think it might be the second thing, can you attach a copy of the merge task you submitted, and the descriptors of the pre-merge and post-merge segments? You can get the descriptors from the coordinator console, from the metadata store’s “payload” column, or from descriptor.json on deep storage.

Hi Gian,

Thanks for your response.

I downloaded samples of the original and merged segments from deep storage, cracked them open, and inspected the binary index. The only difference I found was that the bitmap type had changed from roaring to concise. I believe this happened because we set the bitmap type to roaring in the Tranquility beam, but the default on the server was still concise.

As I mentioned in a previous thread, the kinds of queries we run are very fast with roaring bitmaps and practically unusable with concise, so this does make sense.

Thanks for the tips.

Regards,

Max

Ah ha, that’d explain it. You should be able to pass an “indexSpec” to the merge task as well to fix that.
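
Something like this merged into the task JSON should do it; the bitmap field is the only one you strictly need to set, and the rest of the indexSpec can be left at the defaults:

    {
      "indexSpec": {
        "bitmap": { "type": "roaring" }
      }
    }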

Hi Max, how are you finding the stability of roaring bitmaps for your use cases? We are thinking about eventually making roaring the default, since the benchmarks we’ve run definitely show it to be more performant.

We have nothing but good things to say about roaring bitmaps. If you don’t do much filtering at query time, it doesn’t seem to make much difference, but we actually wouldn’t be able to lay out our data the way we like without the speed boost they provide. I think most users would benefit from the change unless they are extremely sensitive to index size. We haven’t noticed any problems at all.

+1