Druid exactly-once ingestion

I am using a Druid sink with Apache Flink, with Tranquility sending the data to Druid. How can I support exactly-once semantics with a Druid sink? Druid does not support two-phase commits. Is there any trick to get exactly-once with checkpointing in Flink?

Thanks,

Vishwas

Hi Vishwas,

Druid provides exactly-once ingestion guarantees when data is ingested from Apache Kafka using the Kafka indexing service.

http://druid.io/docs/latest/development/extensions-core/kafka-ingestion

https://imply.io/post/exactly-once-streaming-ingestion

Thanks,

Sashi

How about Flink? Is there anything I can do on the Flink side to guarantee exactly-once? I have Kafka as my Flink source and Druid as the sink. For example, could I replace an entire segment when data is dropped over the wire or when the JVM crashes before a checkpoint completes?

Hey Vishwas,

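In that scenario, the simplest thing to do is Kafka -> Flink -> Kafka -> Druid. Flink can write to the output Kafka topic transactionally, in step with its checkpoints, and Druid's Kafka indexing service can then ingest that topic exactly once.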
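
To illustrate why the extra Kafka hop helps, here is a minimal toy model in plain Java. It is only a sketch and uses neither the Flink nor the Druid API; the class and method names (ExactlyOnceSketch, OffsetTrackingStore, pushWithReplay, ingest) are made up for this example. It contrasts a push-style sink, where a batch written before a crash is replayed and duplicated, with a pull-style ingester that commits rows and the source offset in one atomic step, which is roughly the idea behind the Kafka indexing service described in the links above.

```java
import java.util.ArrayList;
import java.util.List;

public class ExactlyOnceSketch {

    /** Hypothetical store that commits rows and the source offset together. */
    static final class OffsetTrackingStore {
        private final List<String> rows = new ArrayList<>();
        private int committedOffset = 0; // next offset to read from the topic

        int committedOffset() { return committedOffset; }
        List<String> rows() { return rows; }

        /** Accepts a batch only if it starts exactly where the last commit ended. */
        boolean commit(int startOffset, List<String> batch) {
            if (startOffset != committedOffset) {
                return false; // stale or overlapping batch: reject it
            }
            rows.addAll(batch);
            committedOffset = startOffset + batch.size();
            return true;
        }
    }

    /** Push-style sink: writes are visible at once, the offset is checkpointed later. */
    static List<String> pushWithReplay(List<String> topic, int batchSize, int crashAtBatch) {
        List<String> sink = new ArrayList<>();
        int checkpointedOffset = 0;
        int batchIndex = 0;
        boolean crashed = false;
        while (checkpointedOffset < topic.size()) {
            int end = Math.min(checkpointedOffset + batchSize, topic.size());
            sink.addAll(topic.subList(checkpointedOffset, end));
            if (!crashed && batchIndex == crashAtBatch) {
                crashed = true; // crash before the checkpoint: the batch is replayed
                continue;
            }
            checkpointedOffset = end;
            batchIndex++;
        }
        return sink;
    }

    /** Pull-style ingestion: rows and offset are committed in one step. */
    static List<String> ingest(List<String> topic, int batchSize, int crashAtBatch) {
        OffsetTrackingStore store = new OffsetTrackingStore();
        int batchIndex = 0;
        boolean crashed = false;
        while (store.committedOffset() < topic.size()) {
            int start = store.committedOffset();
            int end = Math.min(start + batchSize, topic.size());
            List<String> batch = new ArrayList<>(topic.subList(start, end));
            if (!crashed && batchIndex == crashAtBatch) {
                crashed = true; // crash before commit: nothing was stored, so re-read
                continue;
            }
            store.commit(start, batch);
            batchIndex++;
        }
        return store.rows();
    }

    public static void main(String[] args) {
        List<String> topic = List.of("a", "b", "c", "d");
        System.out.println("push sink:   " + pushWithReplay(topic, 2, 1));
        System.out.println("pull ingest: " + ingest(topic, 2, 1));
    }
}
```

In this model the push-style sink ends up with the second batch twice after the simulated crash, while the offset-tracking store ends up with each record once. A real Flink job writing to Kafka and a real Druid supervisor involve considerably more than this, so treat it only as an intuition for the design.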