We have a stream of records coming in through Kafka. Suppose exactly one record has a wrong value in a single column/field. What steps are required to correct it?
From what I have read (almost every page on druid.io), we need to:
prepare this row with the corrected value in an input format (either as CSV or as a Kafka message)
use the “Index Task” to load it in
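To make the two steps above concrete, here is a rough sketch of what I imagine the task payload looks like, written in Python. Everything specific in it is an assumption on my part: the helper name build_index_task, the datasource "events", the column names, and the hourly interval are hypothetical, and the field names follow the native batch "index" task spec as I understand it from the docs, so they may differ between Druid versions. It only builds and prints the JSON; it does not submit anything.

```python
import json


def build_index_task(data_source, interval, rows):
    """Build an index task payload as a dict (hypothetical helper, not a Druid API)."""
    # One JSON object per line, which is what I assume the "json" input format reads.
    inline_data = "\n".join(json.dumps(row, sort_keys=True) for row in rows)
    return {
        "type": "index",
        "spec": {
            "dataSchema": {
                "dataSource": data_source,
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                # Empty list: let Druid discover the dimensions (assumption).
                "dimensionsSpec": {"dimensions": []},
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "NONE",
                    "intervals": [interval],
                },
            },
            "ioConfig": {
                "type": "index",
                "inputSource": {"type": "inline", "data": inline_data},
                "inputFormat": {"type": "json"},
                # False means "do not append"; whether this replaces the whole
                # interval is part of what I am asking.
                "appendToExisting": False,
            },
        },
    }


# Hypothetical corrected record.
corrected_rows = [
    {"timestamp": "2016-01-01T00:00:00Z", "user": "alice", "amount": 42},
]
task = build_index_task(
    "events", "2016-01-01T00:00:00Z/2016-01-01T01:00:00Z", corrected_rows
)
payload = json.dumps(task, indent=2)
print(payload)
```

If a task like this replaces everything in the given interval, then I assume the input would have to contain every row of that interval and not only the corrected one, which is why the questions below matter.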
But I don’t understand whether we also need to do any of the following:
delete old data?
merge into the old segment?
query the old record first to find the old segment?
It’s quite difficult to work out the steps that need to be done to update a record. If someone could list the required steps, and how to perform them, that would help a lot.
A related question: what if we are not trying to merge or overwrite an existing record, but a new record just happens to have exactly the same timestamp as one of our older records? What would happen in Druid? Would Druid replace the old record, or simply add the new one without problems?
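To be clear about the two outcomes I have in mind, here is a toy model in plain Python. It is not Druid code and does not claim to show what Druid does; the function names and the sample records are made up purely to illustrate the question.

```python
def ingest_replace(table, record):
    """Outcome A: the timestamp behaves like a primary key, so the newer record wins."""
    kept = [r for r in table if r["timestamp"] != record["timestamp"]]
    return kept + [record]


def ingest_append(table, record):
    """Outcome B: the record is simply added, even if the timestamp already exists."""
    return table + [record]


# Hypothetical data: two unrelated records that share a timestamp.
old = [{"timestamp": "2016-01-01T00:00:00Z", "user": "alice", "amount": 10}]
new = {"timestamp": "2016-01-01T00:00:00Z", "user": "bob", "amount": 5}

replaced = ingest_replace(old, new)
appended = ingest_append(old, new)

print(len(replaced))  # 1: alice's record is gone
print(len(appended))  # 2: both records are kept
```

Which of these two behaviours (if either) matches what Druid does on ingestion?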