I am trying to drop data from Druid 0.8.3. I want to keep only the past day of data and drop everything older than 1 day. My realtime node is configured with a segment granularity of 1 hour.
I have set the drop rule in the MySQL metadata database as below:
Following this rule, the Druid coordinator marks segments older than 1 day as unused in the MySQL database.
To drop the data from deep storage as well as from the cluster, I created a kill task and ran it for the specific interval. The kill task deletes the segment entries from the MySQL metadata and removes the data from the local deep storage (which is what I expected from a kill task).
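For reference, a kill task of this kind can be sketched as below. This is a minimal illustration and not the exact task I ran: the function name `build_kill_task`, the datasource name `my_datasource`, and the epoch start of the interval are placeholders chosen for the example. The payload uses the standard kill task fields (`type`, `dataSource`, `interval`) and would normally be POSTed to the overlord's `/druid/indexer/v1/task` endpoint; host and port depend on the setup.

```python
import json
from datetime import datetime, timedelta, timezone


def build_kill_task(data_source, now, retention=timedelta(days=1),
                    start="1970-01-01T00:00:00Z"):
    """Build a kill task payload covering everything older than `retention`.

    `now` is assumed to be a timezone-aware datetime in UTC.
    `start` is a placeholder lower bound for the interval.
    """
    # Truncate the cutoff to the hour so it lines up with 1-hour segments.
    cutoff = (now - retention).replace(minute=0, second=0, microsecond=0)
    end = cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "type": "kill",
        "dataSource": data_source,  # placeholder name, not my real datasource
        "interval": f"{start}/{end}",
    }


if __name__ == "__main__":
    task = build_kill_task("my_datasource", datetime.now(timezone.utc))
    print(json.dumps(task, indent=2))
```

A kill task only removes segments that are already marked as unused in the metadata store, which is why the drop rule has to take effect first.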
After running the kill task, I can still see the deleted segments in the coordinator admin console. I’m also able to query data older than 1 day, even though the kill task removed both the deep storage data and the MySQL metadata. Is it possible that the historical or realtime node still retains old segments even after the kill task?
As an experiment, I restarted the historical node, but nothing changed; I still saw the older segments. I then restarted the realtime node and found that the dropped segments were gone from the cluster. So it looks like the realtime node is retaining data older than 1 hour.
So, I have a few questions:
Why is the realtime node not dropping segments older than 1 hour from memory, and why does it keep serving them for queries even after the segments are deleted by the kill task?
How can I make the realtime node drop segments from memory as soon as they are handed off to the historical node? Is there any configuration that I’m missing?
My datasource is configured for realtime data ingestion.
There are no warnings or errors in any node’s log files (realtime, broker, coordinator, historical).
Any help or guidance on this would be greatly appreciated.