Single dimension ingestion is not working with incremental load

Hi,
We are using single-dimension partitioning for a few datasources, which are dropped before every ingestion. In one use case, we need to ingest only the latest one month of data. To meet this requirement we set "appendToExisting": true and drop only the last month's segments, instead of the whole datasource, before ingestion. But single-dimension partitioning with this incremental load throws the error below.
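For context, here is a sketch of the spec combination that triggers the error. The dimension name and row target are placeholders, and the dataSchema and inputSource sections are omitted for brevity:

```json
{
  "type": "index_parallel",
  "spec": {
    "ioConfig": {
      "type": "index_parallel",
      "appendToExisting": true
    },
    "tuningConfig": {
      "type": "index_parallel",
      "forceGuaranteedRollup": true,
      "partitionsSpec": {
        "type": "single_dim",
        "partitionDimension": "someDim",
        "targetRowsPerSegment": 5000000
      }
    }
  }
}
```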

Failed to submit task: Cannot construct instance of org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask, problem: Perfect rollup cannot be guaranteed when appending to existing dataSources a

When I disable "forceGuaranteedRollup" in the single-dimension ingestion spec, I get the error below.

Failed to submit task: Cannot construct instance of org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexTuningConfig, problem: DynamicPartitionsSpec must be used for best-effort rollup at [Source:
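This second error is because, once forceGuaranteedRollup is off, Druid only accepts a dynamic partitionsSpec in the tuningConfig. A minimal sketch of what it expects instead of single_dim (row limit is a placeholder):

```json
"tuningConfig": {
  "type": "index_parallel",
  "forceGuaranteedRollup": false,
  "partitionsSpec": {
    "type": "dynamic",
    "maxRowsPerSegment": 5000000
  }
}
```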

It sounds like Druid does not support single-dimension partitioning for incremental loading. I could not understand the rationale behind this. An incremental load could still move rows into existing segments to preserve perfect rollup. I am wondering: is this row-movement operation too expensive? Is there any configuration to override this behaviour?

Thank you very much.

Jay

While I research this problem, can you give us a higher-level view of what you are trying to do? Is it like an SCD, where each time the value changes you add a new column, and in this case the value changes monthly?

It is not an SCD, actually. It is a dataset, not a dimension. Some of the recent data (the last 2 months) in the dataset can change every day. Hence, before ingesting, we drop the segments for the last 2 months and append the last months' data using appendToExisting=true.

Regards,
Jay

According to the tutorial:
https://druid.apache.org/docs/latest/tutorials/tutorial-update-data.html#append-to-the-data

The data will be grouped together at query time. Any reason not to use dynamic partitioning?

I am still not quite clear on exactly what type of situation you are looking to avoid by doing this.

One possible workaround would be to ingest with dynamic partitioning, then run a compaction using single-dimension partitioning and forceGuaranteedRollup, if that works. To answer one of your original questions: on ingestion, new segments can be added, but existing segments aren't rewritten. Compaction will read and rewrite the segments, and can produce better rollup.
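The workaround above could look roughly like this as a compaction task. The datasource name, interval, dimension, and row target are placeholders, and the exact compaction task fields may vary between Druid versions:

```json
{
  "type": "compact",
  "dataSource": "my_datasource",
  "ioConfig": {
    "type": "compact",
    "inputSpec": {
      "type": "interval",
      "interval": "2020-09-01/2020-11-01"
    }
  },
  "tuningConfig": {
    "type": "index_parallel",
    "forceGuaranteedRollup": true,
    "partitionsSpec": {
      "type": "single_dim",
      "partitionDimension": "someDim",
      "targetRowsPerSegment": 5000000
    }
  }
}
```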

Also, some of this behavior may be different in recent versions - I don't know which version you're using. E.g., for Imply, I believe Imply 3.4 contains some improvements. I'm not sure whether they're also in OSS Druid, offhand, and don't have all the details. Just mentioning it in case you want to look into it.