I have about 10 days of data in Kafka, and I have set the ‘window period’ to 50H in Druid. I am running 3 realtime nodes. After about 5 hours of running them, every event is still being counted as ‘events thrown away’ because it falls outside the ‘window period’. How does the Kafka Firehose decide which Kafka offset to start from? Does it use something like a binary search to find the offset that falls within the ‘window period’?
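To make the symptom concrete, here is a minimal sketch of what I think is happening. It is only an illustration, not Druid's actual code: it assumes events are consumed sequentially from the oldest offset and that an event is rejected when its timestamp is more than the window period behind the current time. The names `is_within_window` and `replay` and the fixed `now` value are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The 50H window period from the question.
WINDOW_PERIOD = timedelta(hours=50)


def is_within_window(event_time, now, window=WINDOW_PERIOD):
    """Assumed rule: keep an event only if it is no older than `window` before `now`."""
    return event_time >= now - window


def replay(events, now):
    """Read events in order and count how many are accepted vs. thrown away."""
    accepted = 0
    thrown_away = 0
    for event_time in events:
        if is_within_window(event_time, now):
            accepted += 1
        else:
            thrown_away += 1
    return accepted, thrown_away


# Arbitrary "current time", chosen only to make the example deterministic.
now = datetime(2015, 1, 11, tzinfo=timezone.utc)

# 10 days of data, one event per hour, oldest first (as a sequential read would see it).
events = [now - timedelta(hours=h) for h in range(240, 0, -1)]

accepted, thrown_away = replay(events, now)
print(f"accepted={accepted}, thrown_away={thrown_away}")
```

Under these assumptions, the first 190 hourly events are all thrown away before a single one is accepted, which would match what I am seeing if the consumer has not yet reached the last 50 hours of data.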
Continuing the discussion here: https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/druid-user/oAHXFpjHxqA