Druid datasource/segments deletion

Hi,

First of all, I would like to thank this community for helping me set up the cluster. It is now up and running.

I have a couple of issues. Please help me resolve them.

Issue 1:

I ingested about 7 days of data (say, 1st Oct to 7th Oct) into the system under the datasource name test_druid. I don’t want to keep the same 7 days indefinitely. Each day, I want to add one day of data and delete the oldest day from the same datasource. I could not find an API to delete this data. Please let me know how to delete data from the same datasource.

Issue 2:

Even when I delete the whole datasource using the API, it takes a very long time to delete the data. The moment I hit the API, it sets the ‘used’ flag to ‘0’ in the druid_segments table. When is the delete process triggered for unused segments? How can I delete segments or a datasource faster?

Please help me resolve these issues, and let me know if you need more details.

Thanks,

V Santhosh Kumar Tangudu

Hi V,

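You’ll want to use load rules in combination with automatic kill tasks from the coordinator. By default, Druid does not delete data; it only drops it from the queryable set. That is why the ‘used’ flag changes to ‘0’ right away while the segments themselves stay in deep storage until a kill task removes them. See: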

http://druid.io/docs/latest/operations/rule-configuration.html
http://druid.io/docs/latest/configuration/coordinator.html
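
For the rolling 7-day window in Issue 1, a rule set along these lines might work. This is only a sketch: it builds the JSON payload and does not send it. The helper name build_retention_rules, the replica count of 2, and the tier name "_default_tier" are assumptions for illustration; check them against your cluster. The payload would be POSTed to the coordinator at /druid/coordinator/v1/rules/test_druid.

```python
import json


def build_retention_rules(days=7, tier="_default_tier", replicants=2):
    """Build a rule list that keeps the most recent `days` days queryable.

    Rules are evaluated in order, so the period rule must come before
    the catch-all drop rule. Tier name and replica count are assumed
    values, not taken from the original thread.
    """
    return [
        {
            "type": "loadByPeriod",
            "period": f"P{days}D",  # ISO-8601 period, e.g. P7D
            "tieredReplicants": {tier: replicants},
        },
        # Anything not matched above is dropped from the queryable set.
        {"type": "dropForever"},
    ]


rules = build_retention_rules(days=7)
print(json.dumps(rules, indent=2))
```

Note that these rules only mark older segments as unused; they do not remove them from deep storage.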

The kill configurations are the druid.coordinator.kill.* properties.
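
If you don't want to wait for the coordinator's automatic kill (controlled by properties such as druid.coordinator.kill.on, druid.coordinator.kill.period and druid.coordinator.kill.durationToRetain), you can also submit a kill task yourself for a specific interval. The sketch below only builds the task JSON. The helper name build_kill_task and the year 2017 are assumptions for illustration, since the question gives only "1st Oct". The task would be POSTed to the overlord at /druid/indexer/v1/task, and it only deletes segments that are already marked unused.

```python
import json
from datetime import date


def build_kill_task(datasource, start, end):
    """Build a kill task spec for unused segments in [start, end).

    The interval uses the ISO-8601 "start/end" form that Druid expects.
    """
    return {
        "type": "kill",
        "dataSource": datasource,
        "interval": f"{start.isoformat()}/{end.isoformat()}",
    }


# Assumed dates: remove the oldest day (1st Oct) of the test_druid datasource.
task = build_kill_task("test_druid", date(2017, 10, 1), date(2017, 10, 2))
print(json.dumps(task, indent=2))
```

The segments for that interval have to be disabled first (used = 0), for example by the drop rule, or the kill task will leave them alone.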