windowPeriod vs segmentGranularity

Sometimes our Kafka queues are backed up and we may get messages that were created 40 to 50 minutes ago.

I want to be able to set my windowPeriod = PT120M [2 hours] and segmentGranularity to 3 hours.

But it looks like 3 hours is not an accepted segmentGranularity.

How can I handle this scenario?

Hi Jagadeesh,

The actual solution you want will come in Druid 0.9.1, where for Kafka-based ingestion there will be no more windowPeriod and you can stream in events with any timestamp, exactly once. The RC should be out in the coming weeks, and you should look into using the KafkaIndexTask.

Right now the best solution is to either 1) set up a lambda architecture or 2) use a long windowPeriod, as you mentioned.
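
To illustrate why a longer windowPeriod helps with delayed messages, here is a minimal Python sketch. It is a simplified model and not Druid's actual code: the function name `is_within_window` and the sample timestamps are made up for illustration, and it assumes an event is accepted only when its timestamp lies within windowPeriod of the current time.

```python
from datetime import datetime, timedelta


def is_within_window(event_time, now, window_period):
    """Simplified model (assumption): an event is accepted only if its
    timestamp is no further than window_period from the current time."""
    return abs(now - event_time) <= window_period


# Hypothetical "current time" and a message created 50 minutes earlier.
now = datetime(2016, 5, 1, 12, 0)
delayed_event = now - timedelta(minutes=50)

# A short window drops the delayed message; a 2-hour window (PT120M) keeps it.
short_window_ok = is_within_window(delayed_event, now, timedelta(minutes=10))
long_window_ok = is_within_window(delayed_event, now, timedelta(hours=2))
print(short_window_ok, long_window_ok)
```

Under this model, the message that is 50 minutes late is dropped with a 10-minute window and accepted with a 2-hour window. The trade-off is that a longer windowPeriod also delays when segments can be handed off.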

– FJ

Hi Fangjin,
Can you please point me to a request or discussion about the removal of windowPeriod for Kafka?

Thanks

Maurizio

Hey Maurizio,

These discussions might interest you:


https://groups.google.com/forum/#!msg/druid-development/kHgHTgqKFlQ/fXvtsNxWzlMJ

Docs on the upcoming Kafka indexing service: https://github.com/druid-io/druid/blob/master/docs/content/development/extensions-core/kafka-ingestion.md

If you have any specific questions about Kafka ingestion or window periods, you can ask them here as well.

Hey guys, KafkaIndexTask looks like a game changer. I’m about to start working with it, but I have a quick question. Can these tasks serve queries before they publish a segment, just like the realtime nodes can?

Hi Drew, yes they can.

We’re working on a new tutorial about how to use them.

When will these changes be moved to the Tranquility API? Now that will be a game changer for sure.

Hey Jagadeesh,

We intend to make similar changes in Tranquility once the Kafka-based version has been put through its paces a bit.