MySQL metastore purging

I’m migrating my MySQL metastore and have noticed that there is a lot more data in it than I assumed there would be. I’m hoping that a lot of it can be purged, and that in the future I can set up a retention schedule for some tables. The two tables I am most interested in purging are druid_audit and druid_tasklogs. They seem to keep audit and log data permanently, and most of it becomes dead weight after some amount of time.

Before I begin purging, I wanted to get some feedback about the metastore:

  1. Am I correct in thinking that the druid_audit and druid_tasklogs tables have nothing to do with the functioning of the cluster? That is, deleting all of the data in each of them would be harmless to cluster health, at the cost of losing the ability to debug or troubleshoot past issues.

  2. Are there any other tables in the metastore that are safe to purge on a continuous schedule? druid_tasks is another one that seems to keep data from “forever”, and deleting all records older than 14 days looks like it would be relatively safe.


Still looking for an answer to this question. I have ~20 GB of data between druid_audit and druid_tasklogs that I would love to purge. I am fairly certain these rows are safe to delete, but I would prefer confirmation before pulling the trigger in production.

It’s safe to drop the rows in druid_audit and druid_tasklogs.
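
If you want to delete by age rather than all at once, here is a minimal sketch of a batched purge in Python. It is not an official Druid tool. It assumes the table has an integer id primary key and a created_date column holding ISO-8601 strings in one consistent format, so that plain string comparison orders them by time; check your own schema first, since not every metastore table necessarily has a created_date column. The function name purge_older_than, the allow-list, and the batch size are all made up for illustration.

```python
# Hypothetical helper, not part of Druid. Works with any DB-API connection.
# Assumes: `id` primary key, `created_date` stored as a sortable ISO-8601 string.

ALLOWED_TABLES = {"druid_audit"}  # allow-list, since table names cannot be bound as parameters


def purge_older_than(conn, table, cutoff_iso, batch_size=500, placeholder="%s"):
    """Delete rows with created_date < cutoff_iso in small batches.

    Returns the number of rows deleted. `placeholder` is the driver's
    parameter marker ("%s" for the common MySQL drivers, "?" for sqlite3).
    """
    if table not in ALLOWED_TABLES:
        raise ValueError(f"refusing to purge unexpected table: {table}")

    total = 0
    while True:
        cur = conn.cursor()
        # Pick one batch of candidate ids so each transaction stays short.
        cur.execute(
            f"SELECT id FROM {table} WHERE created_date < {placeholder} "
            f"LIMIT {int(batch_size)}",
            (cutoff_iso,),
        )
        ids = [row[0] for row in cur.fetchall()]
        if not ids:
            cur.close()
            return total

        marks = ", ".join([placeholder] * len(ids))
        cur.execute(f"DELETE FROM {table} WHERE id IN ({marks})", tuple(ids))
        conn.commit()  # commit per batch to avoid one huge transaction
        cur.close()
        total += len(ids)
```

With a MySQL driver that uses the %s parameter style, a call would look like purge_older_than(conn, "druid_audit", "2021-01-01T00:00:00.000Z"), where the cutoff string is written in the same format as the values already in created_date. Deleting in batches keeps each transaction small; note that in InnoDB the freed space is generally not returned to the filesystem until the table is rebuilt.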