Understanding the windowPeriod in Druid realtime ingestion

While going through the Druid documentation, I read the following:

The windowPeriod is the slack time permitted for events. For example, a windowPeriod of ten minutes (the default) means that any events with a timestamp older than ten minutes in the past, or more than ten minutes in the future, will be dropped.

This was not very clear to me, so I am hoping somebody can tell me whether my understanding is correct:

Current timestamp: 14:50

Active segment interval on the realtime node: 14:00 to 15:00

windowPeriod: 10 minutes

Suppose I now get an event whose timestamp is 14:10. Will it be dropped or accepted? My understanding was that, since the window period is 10 minutes, any event with a timestamp from 14:00 to 15:10 would be accepted?

Or, since the current timestamp is 14:50, will only events with timestamps between 14:40 and 15:00 be accepted?
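To make the two readings concrete, here is a minimal Python sketch. It is not Druid code: the function names and the date are made up for illustration, and each function only encodes one of the two interpretations asked about above.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # the windowPeriod from the example


def accepted_reading_a(event_ts, seg_start, seg_end, window=WINDOW):
    """Reading A: the window extends the end of the segment interval."""
    return seg_start <= event_ts <= seg_end + window


def accepted_reading_b(event_ts, now, window=WINDOW):
    """Reading B: the window is measured around the current time."""
    return now - window <= event_ts <= now + window


def at(hour, minute):
    # Arbitrary date; only the time of day matters in this example.
    return datetime(2016, 1, 1, hour, minute)


event = at(14, 10)
reading_a = accepted_reading_a(event, at(14, 0), at(15, 0))
reading_b = accepted_reading_b(event, now=at(14, 50))
print(reading_a)  # True: 14:10 falls inside 14:00 to 15:10
print(reading_b)  # False: 14:10 is older than 14:40
```

So the 14:10 event is accepted under the first reading and dropped under the second, which is exactly the difference I am asking about.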

Thank you so much in advance.



Hi Vinay,
In the case you mentioned, only events that lie within the segment interval 14:00 to 15:00 would be accepted.

The realtime index task will wait until 15:10 for any lagging events for the above interval, after which it will create an immutable segment and hand it over to the historical node.

For example, if the index task receives an event with timestamp 14:01 at 15:09, it is expected to be ingested; if it receives an event with timestamp 14:01 at 15:11, it will be considered outside the window period.
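The same timeline can be written as a small Python sketch. This only models the behaviour described in this reply, not Druid's actual implementation; the function name and date are hypothetical, and treating an arrival at exactly 15:10 as still on time is an assumption.

```python
from datetime import datetime, timedelta


def accepted_per_reply(event_ts, arrival, seg_start, seg_end, window):
    """Model of the behaviour described in this reply (not Druid's code).

    The event must belong to the segment interval, and it must arrive
    before the task hands the segment off at seg_end + window.
    """
    in_interval = seg_start <= event_ts < seg_end
    before_handoff = arrival <= seg_end + window  # inclusive bound is an assumption
    return in_interval and before_handoff


def at(hour, minute):
    # Arbitrary date; only the time of day matters in this example.
    return datetime(2016, 1, 1, hour, minute)


seg_start, seg_end = at(14, 0), at(15, 0)
window = timedelta(minutes=10)

on_time = accepted_per_reply(at(14, 1), at(15, 9), seg_start, seg_end, window)
late = accepted_per_reply(at(14, 1), at(15, 11), seg_start, seg_end, window)
print(on_time)  # True: arrives before the 15:10 hand-off
print(late)     # False: arrives after the 15:10 hand-off
```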

Also, note that the new Kafka Indexing Service does have the ability to ingest delayed data and does not have the windowPeriod limitations.

See http://druid.io/docs/latest/development/extensions-core/kafka-ingestion.html.

Thank you so much, Nishant, for the great explanation. I will also have a look at the Kafka Indexing Service, as you suggested.