We’ve had an interesting experience with merging segments.
We have about 3 weeks of data indexed via Tranquility at SIX_HOUR granularity. We wanted to try rolling those up into DAY granularity segments with merge tasks.
Since we use Tranquility, we can't use the coordinator's automatic merging feature, because Tranquility always writes segments with a linear shard spec. But since we only have a single shard in this datasource, we tried rewriting the shard spec to none and manually submitting merge tasks to the overlord.
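For reference, the merge tasks we submitted looked roughly like this (datasource name, interval, version, and the aggregator are illustrative placeholders here, not our real values; the segments array carries the full segment descriptors, which we copied from the coordinator's metadata):

```json
{
  "type": "merge",
  "dataSource": "example_datasource",
  "aggregations": [
    { "type": "longSum", "name": "count", "fieldName": "count" }
  ],
  "segments": [
    {
      "dataSource": "example_datasource",
      "interval": "2016-01-01T00:00:00.000Z/2016-01-01T06:00:00.000Z",
      "version": "2016-01-01T00:00:01.000Z",
      "shardSpec": { "type": "none" }
    }
  ]
}
```

We POSTed one of these per day to the overlord's task endpoint, listing all four SIX_HOUR segments for that day in the segments array.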
The tasks do run, complete in about a minute, and publish a new segment with a full day’s data. So we merged the 3 weeks of data to day segments. The original segments were around 100-200MB each and the merged segments were around 500-600MB.
The problem we observed is that query latency increased alarmingly: historical nodes took 10-20x longer to return, and CPU usage and load average went through the roof. We did not see any errors at all on the historicals, only very high load. When we disabled the new merged segments, behavior returned to normal.
Has anyone seen something like this, or have a theory as to why this would happen?