Deleting segments with drop rules - Druid 0.8.3

Hi All,

I am currently working on dropping data older than 1 hour from the cluster. I have set the rules as below:

[{"period":"PT1H","tieredReplicants":{"_default_tier":1},"type":"loadByPeriod"},{"type":"dropForever"}]

According to these rules, only the past hour of data should be considered by any Druid query. The problem is that although the rules set the ‘used’ field in the segments table to 0, I can still query the old data.

Can you tell me the possible causes of this behavior? I also disabled the cache and tried timeseries and select queries, but they still return the old data.

Your quick reply would be a great help.

Thanks,
Jvalant

Hi Jvalant,

Coordinator rules apply to segments only after they have been handed off to the historical nodes, and the rules are run periodically.

If your segments are on realtime nodes, coordinator rules won't be applied to them.

If you want to query data for the previous hour only, you can specify the interval in your query.
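As a rough illustration of restricting a query to the previous hour, here is a minimal Python sketch that builds a native timeseries query body. The datasource name `my_datasource` and the helper names are hypothetical, and the `count` aggregation is only a placeholder for whatever you actually aggregate; the resulting JSON would be POSTed to your broker.

```python
import json
from datetime import datetime, timedelta, timezone


def last_hour_interval(now=None):
    """Return an ISO-8601 interval string covering the hour before `now`.

    `now` must be timezone-aware; it is converted to UTC before formatting.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    start = now - timedelta(hours=1)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{start.strftime(fmt)}/{now.strftime(fmt)}"


def build_timeseries_query(data_source, now=None):
    """Build a timeseries query body limited to the previous hour."""
    return {
        "queryType": "timeseries",
        "dataSource": data_source,  # hypothetical name, use your own
        "granularity": "all",
        "aggregations": [{"type": "count", "name": "rows"}],  # placeholder
        "intervals": [last_hour_interval(now)],
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the broker.
    print(json.dumps(build_timeseries_query("my_datasource"), indent=2))
```

Because the interval is computed at query time, this limits results to the last hour regardless of which nodes are still serving older segments.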

Any data older than 1 hour will eventually be dropped when the coordinator rules run.
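To make the rule chain easier to reason about, here is a small Python sketch of how I understand the coordinator to evaluate it: rules are checked in order and the first one that matches a segment wins. This is a simplified model, not Druid's actual code. It assumes that `loadByPeriod` matches when the segment interval overlaps the window from (now - period) to now, and it only understands the `PT1H` period used in this thread.

```python
from datetime import datetime, timedelta, timezone

# The rule chain from this thread.
RULES = [
    {"type": "loadByPeriod", "period": "PT1H",
     "tieredReplicants": {"_default_tier": 1}},
    {"type": "dropForever"},
]

# Simplification: only the period used here is supported (no ISO-8601 parser).
PERIODS = {"PT1H": timedelta(hours=1)}


def first_matching_rule(rules, seg_start, seg_end, now):
    """Return the type of the first rule that applies to a segment interval."""
    for rule in rules:
        if rule["type"] == "loadByPeriod":
            window_start = now - PERIODS[rule["period"]]
            # Assumption: the rule matches when the segment interval
            # overlaps the window [now - period, now).
            if seg_start < now and seg_end > window_start:
                return rule["type"]
        elif rule["type"] == "dropForever":
            # Matches every segment that reached this point in the chain.
            return rule["type"]
    return None


if __name__ == "__main__":
    now = datetime(2016, 5, 1, 12, 30, tzinfo=timezone.utc)
    for hour in (10, 11, 12):
        start = datetime(2016, 5, 1, hour, tzinfo=timezone.utc)
        end = start + timedelta(hours=1)
        print(start.isoformat(), first_matching_rule(RULES, start, end, now))
```

Under this model an hourly segment stays loaded as long as any part of it falls inside the last hour, so the drop happens at segment granularity on a later coordinator run rather than exactly at the one-hour mark.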

Hi Nishant,

I am still confused. My segments are on a realtime node, so how can I drop data older than 1 hour from the historical node? I don't want to specify the interval in the query; my requirement is to drop the data from the node.

With the below rule,

[{"period":"PT1H","tieredReplicants":{"_default_tier":1},"type":"loadByPeriod"},{"type":"dropForever"}]

Won't it take effect immediately? I can see “0” in the used field in the segments table, but the data is still returned by queries. Can you explain this?

Looking forward to your reply.

Thanks,
Jvalant

Jvalant, rules only apply to historical nodes. Handoff needs to occur before the rules can take effect.

Please try to follow http://imply.io/docs/latest/tutorial-kafka-indexing-service.html for setting up a data pipeline from Kafka. We strongly discourage using realtime nodes after 0.9.0.

Hi Fangjin,

Thanks for the reply.

How can I modify the handoff-related timing? Do I have to configure a property on the realtime or coordinator node? Please suggest the possible modification.

Thanks,
Jvalant

Please keep this discussion in one topic. You’ve been asking the same question over and over in many different topics. Let’s use https://groups.google.com/forum/#!topic/druid-user/---lAda12sc