Druid Roadmap (from the last meetup)

Hey guys, we presented our short-, medium-, and long-term plans for Druid at our last meetup. The slides are below:

We may not get to everything in this list, but we hope to follow the outlined schedule. There are also some notes around the intended purpose of the proposed features.

– FJ

You can also use this thread to ask questions and propose other roadmap items.

Thanks for sharing this. Information like this is great for people who couldn’t make it to the meetup.

– himanshu

The roadmap talks about ‘exactly once’ ingestion as a long-term plan.

However, it has been proven that exactly-once delivery is simply not possible.

The FLP paper states:

Crucial to our proof is that processing is completely asynchronous; that is, we
make no assumptions about the relative speeds of processes or about the delay
time in delivering a message. We also assume that processes do not have access to
synchronized clocks, so algorithms based on time-outs, for example, cannot be
used. (In particular, the solutions in [6] are not applicable.) Finally, we do not
postulate the ability to detect the death of a process, so it is impossible for one
process to tell whether another has died (stopped entirely) or is just running very
slowly.

Additionally, the blogger seems to be interpreting a much stricter version of exactly once than what I’ve heard Gian speak about with regard to ingestion. Specifically:

The way we achieve exactly-once delivery in practice is by faking it. Either the messages themselves should be idempotent, meaning they can be applied more than once without adverse effects, or we remove the need for idempotency through deduplication. Ideally, our messages don’t require strict ordering and are commutative instead.

Exactly once, in my eyes, simply means that events coming in late, resent from the source, or dropped due to internal service failure all end up queryable (and counted only once in the query) using only a single stream-processing pipeline (no batch fixup).

I think a definition of “exactly once” that means simply “what is in Druid reflects your system of record” is both possible and useful. Whether or not your system of record reflects reality is your own problem :)
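
To make the "faking it" idea from the quote concrete, here is a rough sketch of deduplication on top of at-least-once delivery. This is only an illustration, not how Druid does or will implement it: the `DedupingCounter` class, the event IDs, and the hour buckets are all made up, and it assumes every event carries a unique ID assigned by the system of record.

```python
from collections import defaultdict


class DedupingCounter:
    """Hypothetical sink: counts events per hour, applying each event ID at most once."""

    def __init__(self):
        self.seen_ids = set()           # IDs already applied (assumed unique per event)
        self.counts = defaultdict(int)  # hour bucket -> number of distinct events

    def ingest(self, event_id, hour):
        """Apply the event unless it was already applied; return True if it was counted."""
        if event_id in self.seen_ids:
            return False  # duplicate delivery, safe to ignore
        self.seen_ids.add(event_id)
        self.counts[hour] += 1
        return True


# At-least-once delivery: "e2" is resent by the source and "e1" arrives late.
delivered = [("e2", 10), ("e3", 11), ("e2", 10), ("e1", 9)]

counter = DedupingCounter()
for event_id, hour in delivered:
    counter.ingest(event_id, hour)

result = dict(counter.counts)
print(result)
```

Each of the three distinct events is counted once, no matter how many times or in what order it was delivered. A real system would also have to bound the set of remembered IDs and make the "check, then apply" step atomic across failures, which is where the hard part is.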