I am using Apache Druid 24.0.0 and ingesting data from an AWS S3 bucket using the MSQ engine. So far I have been running full-load ingestion of CSV files from S3 into Druid with the REPLACE INTO command, so the table is recreated with the new full-load data every time.
Now I am getting incremental feeds in S3 and would like to know how to update records in Druid the way we do in normal SQL, so that after an incremental ingestion run Druid has the updated records.
Are you trying to append additional rows or update existing ones?
Some records I need to append and some records I need to update.
Please see Upserts and Data Deduplication with Druid - Imply and let me know if you have any additional questions.
Thank you @Anil_Gupta
Also, can I use OVERWRITE clause to update records?
You can use OVERWRITE to replace the records in specific time ranges. All the records in the chosen time range are replaced, so it is not a single-row update, but it might address your needs.
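To make that concrete, here is a rough sketch of what such a query could look like, built as a string in Python so it can be submitted to the MSQ task API or pasted into the query view. The datasource name, S3 URI, and the columns `ts`, `id`, and `status` are made up for illustration. Because everything in the overwritten time range is replaced, the sketch assumes the incremental file holds all rows for that day, not only the changed ones.

```python
import json


def overwrite_day_from_s3(datasource, start, end, s3_uri):
    """Build a REPLACE ... OVERWRITE WHERE query that reloads one time range from S3.

    Hypothetical schema: the CSV has a header with columns ts, id, status.
    Values are interpolated without escaping, so this is a sketch only.
    """
    input_source = json.dumps({"type": "s3", "uris": [s3_uri]})
    input_format = json.dumps({"type": "csv", "findColumnsFromHeader": True})
    signature = json.dumps([
        {"name": "ts", "type": "string"},
        {"name": "id", "type": "string"},
        {"name": "status", "type": "string"},
    ])
    return (
        f'REPLACE INTO "{datasource}"\n'
        # Only this time range is replaced; it should align with PARTITIONED BY.
        f"OVERWRITE WHERE __time >= TIMESTAMP '{start}' AND __time < TIMESTAMP '{end}'\n"
        "SELECT\n"
        '  TIME_PARSE("ts") AS __time,\n'
        '  "id",\n'
        '  "status"\n'
        "FROM TABLE(\n"
        "  EXTERN(\n"
        f"    '{input_source}',\n"
        f"    '{input_format}',\n"
        f"    '{signature}'\n"
        "  )\n"
        ")\n"
        "PARTITIONED BY DAY"
    )


query = overwrite_day_from_s3(
    "my_table",                      # hypothetical datasource
    "2023-01-01 00:00:00",
    "2023-01-02 00:00:00",
    "s3://example-bucket/incremental/2023-01-01.csv",  # hypothetical URI
)
print(query)
```

Data outside the OVERWRITE range is left untouched, which is what distinguishes this from the OVERWRITE ALL full reload.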
For instance, you can use the same datasource as both the source and the destination of your REPLACE query, and use filters and transformations to modify the data.
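A minimal sketch of that self-referencing pattern, again as a Python string builder. The datasource `my_table` and the columns `id` and `status` are hypothetical, and the CASE expression stands in for whatever transformation you need. The SELECT reads the same time range that is being overwritten, so rows that do not match the CASE condition are written back unchanged.

```python
def self_update_query(datasource, start, end, row_id, new_status):
    """Build a REPLACE query that reads and rewrites the same datasource.

    Hypothetical columns: id, status. Values are interpolated without
    escaping, so treat this as an illustration, not production code.
    """
    interval = f"__time >= TIMESTAMP '{start}' AND __time < TIMESTAMP '{end}'"
    return (
        f'REPLACE INTO "{datasource}"\n'
        f"OVERWRITE WHERE {interval}\n"
        "SELECT\n"
        "  __time,\n"
        '  "id",\n'
        # Change the status of one id; keep every other row as it was.
        f"  CASE WHEN \"id\" = '{row_id}' THEN '{new_status}' ELSE \"status\" END AS \"status\"\n"
        f'FROM "{datasource}"\n'
        # Read exactly the range being overwritten so no rows are lost.
        f"WHERE {interval}\n"
        "PARTITIONED BY DAY"
    )


query = self_update_query(
    "my_table", "2023-01-01 00:00:00", "2023-01-02 00:00:00", "abc", "closed"
)
print(query)
```

Keeping the OVERWRITE WHERE range and the SELECT filter identical is the important part: if the SELECT returned fewer rows than the range holds, the missing rows would be dropped from the datasource.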