Druid Load & Drop rules

Hi there,
I’m having a hard time understanding the Druid load & drop rules. Let’s say I have the following rules,

{
  "type" : "loadByPeriod",
  "period" : "P1W",
  "tieredReplicants": {
      "hot": 1,
      "_default_tier" : 1
  }
}

and

{
  "type" : "dropByPeriod",
  "period" : "P1M"
}

[load by 1W and drop by 1M]

So my understanding is that Druid will load all segments from the last week into the SSD cache location, and drop segments that are one month old from the cluster (which means the Druid broker can't query those segments). Segments that are less than a month old will still be in deep storage, and a historical node will pull them from S3 if a query needs them.

Is my understanding correct?


Hi,

Segments are matched against the rules in order and the first rule that matches is applied.

With the above rules, the expected behavior is:

  1. Segments from the last 1W (between now - 1W and now) will match the first rule and be loaded on the historicals.

  2. Segments between 1W and 1M old will match the second rule and will be dropped from the historicals. Data for this period can’t be queried, but the segments will still be in deep storage.

  3. Segments older than 1M will match neither of these rules, so as per the default rule they will be loaded with replication 2 and will be available for queries.

If you want to drop all segments older than 1W instead, use the dropForever rule in place of the dropByPeriod rule.
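
To make the first-match behavior concrete, here is a rough Python sketch. It is not Druid code, just a simplified model: a segment is reduced to its age in days, P1W is taken as 7 days and P1M as 30 days, and the default rule is assumed to be a loadForever with 2 replicas, as described above. The names `rule_matches`, `first_matching_rule`, `original_rules` and `suggested_rules` are made up for the illustration.

```python
# Simplified model of first-match rule evaluation (not Druid code).
# Assumptions: a segment is represented only by its age in days,
# P1W = 7 days, P1M = 30 days, and the cluster default rule is a
# loadForever with 2 replicas on the default tier.

PERIOD_DAYS = {"P1W": 7, "P1M": 30}

DEFAULT_RULE = {"type": "loadForever", "tieredReplicants": {"_default_tier": 2}}


def rule_matches(rule, age_days):
    """Return True if the rule applies to a segment of the given age."""
    if rule["type"] in ("loadForever", "dropForever"):
        return True  # "forever" rules match every segment
    if rule["type"] in ("loadByPeriod", "dropByPeriod"):
        return age_days <= PERIOD_DAYS[rule["period"]]
    raise ValueError(f"rule type not modelled: {rule['type']}")


def first_matching_rule(rules, age_days):
    """Walk the rules in order, falling back to the default rule last."""
    for rule in rules + [DEFAULT_RULE]:
        if rule_matches(rule, age_days):
            return rule["type"]


# The two rules from the question, in the order they were given.
original_rules = [
    {"type": "loadByPeriod", "period": "P1W",
     "tieredReplicants": {"hot": 1, "_default_tier": 1}},
    {"type": "dropByPeriod", "period": "P1M"},
]

# Same load rule, but with dropForever as the second rule.
suggested_rules = [
    original_rules[0],
    {"type": "dropForever"},
]

for age in (3, 14, 60):
    print(age,
          first_matching_rule(original_rules, age),
          first_matching_rule(suggested_rules, age))
```

Under these assumptions, a 60-day-old segment falls through to the default loadForever rule with the original rules, but is caught by dropForever with the suggested ones. A 3-day-old segment is loaded in both cases, and a 14-day-old segment is dropped in both cases.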

Thanks Nishant.