Auto compaction in druid-0.15.0-incubating

Hi,

Does auto compaction run only when the Druid coordinator is restarted, and not at regular intervals?

It appears that auto compaction is based on the size of the input segments: it compacts multiple segments whose input size is within “Input segment size bytes” into a larger segment not exceeding “Target compaction size bytes”. The segment granularity is ignored. Is this correct?

Is it possible to compact datasources so that each month’s data is compacted into one segment (regardless of the granularity used during ingestion)? (This can be done by submitting a compact task to the coordinator, but it doesn’t seem possible with auto compaction.)
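A minimal sketch of the kind of manual compact task mentioned above, assuming the task spec accepts `type`, `dataSource`, `interval`, and `segmentGranularity` fields. The datasource name and interval are hypothetical placeholders, and the helper `build_monthly_compact_task` is made up for illustration; check the field names against the docs for your Druid version before submitting anything.

```python
import json


def build_monthly_compact_task(data_source, interval):
    """Build a manual compact task spec asking for MONTH segments.

    Field names are assumed from the compaction task spec; verify them
    against the documentation for the Druid version in use.
    """
    return {
        "type": "compact",
        "dataSource": data_source,
        # ISO-8601 interval covering the data to rewrite
        "interval": interval,
        # requested granularity of the compacted output
        "segmentGranularity": "MONTH",
    }


# Hypothetical datasource name and interval, for illustration only.
task = build_monthly_compact_task("my_datasource", "2019-06-01/2019-07-01")
print(json.dumps(task, indent=2))
```

The sketch only builds and prints the JSON body; how the task is submitted is left out on purpose.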

Thanks,

Prathamesh

Prathamesh,

Auto compaction runs indefinitely once it's submitted, until it is explicitly killed or a new configuration overrides it.

For auto compaction to run, either of the following two conditions must be met.

  1. The coordinator must show a total segment size for the interval that is less than inputSegmentSizeBytes
  2. The coordinator must show a total number of segments for the interval that is less than maxNumSegmentsToCompact

Auto compaction and regular compaction differ: the latter exposes all the parameters you can tweak, whereas the former exposes only a subset.
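To illustrate the smaller parameter set, here is a rough sketch of an auto compaction config built as a Python dict. It assumes the config uses the keys `dataSource`, `inputSegmentSizeBytes`, `maxNumSegmentsToCompact`, and `targetCompactionSizeBytes`. The helper name, the datasource name, and all values are illustrative assumptions, not recommendations.

```python
import json

MB = 1024 * 1024


def build_auto_compaction_config(data_source, input_bytes, max_segments, target_bytes):
    """Build an auto compaction config for one datasource.

    Key names are assumed from the coordinator's compaction config;
    verify them against the documentation for the Druid version in use.
    """
    return {
        "dataSource": data_source,
        # upper bound on the total input bytes considered for compaction
        "inputSegmentSizeBytes": input_bytes,
        # upper bound on the number of segments compacted together
        "maxNumSegmentsToCompact": max_segments,
        # desired size of each compacted segment
        "targetCompactionSizeBytes": target_bytes,
    }


# Hypothetical datasource name and illustrative values only.
config = build_auto_compaction_config("my_datasource", 10 * MB, 150, 400 * MB)
print(json.dumps(config, indent=2))
```

The sketch deliberately includes no segment granularity key, in line with the observation earlier in the thread that auto compaction does not appear to offer one in this version.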

Rommel Garcia

Hi Rommel,

  1. Does the compaction start immediately after it is submitted?

  2. How often does it run?

  3. When you say “total size of interval”, does that mean the time chunks (segmentGranularity) or an arbitrary interval?

  4. If I have 6 segments (one for each hour, segmentGranularity being HOUR) of 1 MB each, would compaction run with inputSegmentSizeBytes=10MB? Or is inputSegmentSizeBytes a limit on each individual segment?

Thanks,

Prathamesh