I am ingesting data using the Kafka Firehose API. I killed a Realtime node. I want to restart the Realtime node so that it persists, merges, and hands off all of its old data (so that the basePersist directory is completely empty) but does not ingest any new data, since I want to remove that machine from the cluster without losing any data. Is there a way to achieve this?
Hi Saksham, currently no; this goes against the design of realtime nodes, which try to recover once they restart or fail in some way. This thread might be interesting to you: https://groups.google.com/forum/#!searchin/druid-development/windowperiod/druid-development/kHgHTgqKFlQ/fXvtsNxWzlMJ
I understand that this goes against the design. But what happens if we need to change or upgrade the hardware, or retire some node permanently?
If you are using realtime nodes, you can spin up a brand-new node under a different Kafka consumer group and retire the old one.
You can run the old and new nodes in tandem for some time (depending on your segment granularity), and retire the old node after one handoff has occurred.
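To illustrate, the consumer group is set in the `consumerProps` section of the Kafka firehose in the realtime spec. A rough sketch of what the new node's spec might look like is below; the firehose type, topic name, group id, and ZooKeeper address are all placeholders, and the exact firehose type depends on your Druid and Kafka versions:

```json
{
  "ioConfig": {
    "type": "realtime",
    "firehose": {
      "type": "kafka-0.8",
      "consumerProps": {
        "zookeeper.connect": "zk.example.com:2181",
        "group.id": "my-datasource-new",
        "auto.offset.reset": "largest"
      },
      "feed": "my-topic"
    }
  }
}
```

Because the new node joins under a different `group.id`, both nodes consume the full topic independently for the overlap period; once the old node has handed off its final segments, you can shut it down without losing data.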