Backfilling dimension data on existing segments


We have a use case where we need to add new dimensions (for example, state and city) to an existing data source. After adding the new dimensions to the Kafka source and restarting the Kafka Indexing Service, the new dimensions automatically showed up in the Druid data source and we are able to query the data fine. My question is about backfilling the state and city dimensions for previously created segments: is Hadoop batch ingestion the only option available, or is there some other way to backfill the state and city data for older segments without rebuilding/reindexing all the existing segments?



Hi Prem,

To add another column to existing segments, you will need to reindex the data and create new segments; there is no other option.
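For reference, a native batch index task that reingests the raw data (which now carries the new columns) might look roughly like the sketch below. The data source name, base directory, dimension names, and interval are all placeholders, and the exact spec layout varies between Druid versions, so treat this as an outline rather than a copy-paste spec:

```json
{
  "type": "index",
  "spec": {
    "dataSchema": {
      "dataSource": "my_datasource",
      "parser": {
        "type": "string",
        "parseSpec": {
          "format": "json",
          "timestampSpec": { "column": "timestamp", "format": "iso" },
          "dimensionsSpec": {
            "dimensions": ["existing_dim", "state", "city"]
          }
        }
      },
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "intervals": ["2018-01-01/2018-02-01"]
      }
    },
    "ioConfig": {
      "type": "index",
      "firehose": {
        "type": "local",
        "baseDir": "/data/raw",
        "filter": "*.json"
      }
    }
  }
}
```

Note that the old segments do not contain the state and city values at all, so the reindex has to read from the raw source data rather than from the existing segments.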

Thanks for the reply Nishant.

We took the approach of reindexing the Druid segments and implemented a way to parallelize the reindexing process.
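One way such a backfill can be parallelized is to split the overall time range into small intervals and submit one independent index task per interval to the Overlord, which then runs them concurrently. Below is a minimal sketch, assuming the Overlord listens at localhost:8090 and that `build_spec` fills in a task template for a given interval; both the URL and the helper are assumptions for illustration, not details from this thread:

```python
import json
import urllib.request
from datetime import date, timedelta

OVERLORD_URL = "http://localhost:8090"  # assumed Overlord location


def daily_intervals(start: date, end: date) -> list:
    """Split [start, end) into ISO-8601 day intervals, one per reindex task."""
    intervals = []
    day = start
    while day < end:
        nxt = day + timedelta(days=1)
        intervals.append(f"{day.isoformat()}/{nxt.isoformat()}")
        day = nxt
    return intervals


def build_spec(interval: str) -> dict:
    """Hypothetical helper: fill an index-task template with one interval."""
    return {
        "type": "index",
        "spec": {
            "dataSchema": {
                "dataSource": "my_datasource",  # placeholder name
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "DAY",
                    "intervals": [interval],
                },
            },
        },
    }


def submit_task(spec: dict) -> None:
    """POST one task spec to the Overlord's task-submission endpoint."""
    req = urllib.request.Request(
        f"{OVERLORD_URL}/druid/indexer/v1/task",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)


# Example: one task per day across January 2018 (not executed here):
# for iv in daily_intervals(date(2018, 1, 1), date(2018, 2, 1)):
#     submit_task(build_spec(iv))
```

Splitting on segment-granularity boundaries (here, one day) keeps each task's output aligned with the existing segment intervals, so the new segments cleanly replace the old ones.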