Is it safe to truncate the pendingSegments table?

I have a few questions about Druid’s pendingSegments table.

  • What exactly is this table used for?
  • If we want to manually truncate/purge the entries in the pendingSegments table, what precautions should we be aware of?
  • Can the pendingSegments table be purged while real-time indexing (Kafka) tasks are running? Or should the supervisors be suspended before manually purging the entries?

Thanks.

Hi,
This table is used internally to allocate new segment IDs during ingestion. When a Kafka or Kinesis ingestion task encounters data that needs a new segment, Druid records the allocated segment ID in this table. It holds allocation metadata, not the ingested rows themselves, and old entries can pile up over time.
Please don’t drop the table itself, but you may purge old rows from it. Anything older than a day or so is probably safe to remove. Don’t purge it while Kafka ingestion is running; suspend the supervisors first.
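
If you do purge by hand, here is a minimal sketch of how you might build the DELETE for rows older than a cutoff. It makes several assumptions you should check against your own metadata store: that the table is named druid_pendingSegments (the default "druid" prefix), that it has dataSource and created_date columns, that created_date holds ISO 8601 UTC strings so a plain string comparison orders them by time, and that your database driver uses %s placeholders. The function name build_purge_statement is made up for illustration; it is not part of Druid.

```python
from datetime import datetime, timedelta, timezone


def build_purge_statement(data_source, now, min_age=timedelta(days=1),
                          table="druid_pendingSegments"):
    """Return (sql, params) deleting pending-segment rows older than min_age.

    `now` must be a timezone-aware datetime. The table and column names are
    assumptions about the metadata schema; verify them before running this.
    """
    cutoff = now.astimezone(timezone.utc) - min_age
    # Format as ISO 8601 with millisecond precision and a trailing "Z",
    # e.g. 2024-01-01T00:00:00.000Z (assumed to match the stored format).
    cutoff_str = (cutoff.strftime("%Y-%m-%dT%H:%M:%S.")
                  + f"{cutoff.microsecond // 1000:03d}Z")
    sql = f"DELETE FROM {table} WHERE dataSource = %s AND created_date < %s"
    return sql, (data_source, cutoff_str)


if __name__ == "__main__":
    # "wikipedia" is a placeholder datasource name.
    sql, params = build_purge_statement("wikipedia", datetime.now(timezone.utc))
    print(sql)
    print(params)
```

Run the matching SELECT COUNT(*) with the same WHERE clause first to see how many rows would go, and take a backup of the table before executing the DELETE.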

There’s also an option to have the Coordinator clean up old entries automatically:
druid.coordinator.kill.pendingSegments.on
See https://druid.apache.org/docs/latest/configuration/index.html

Thanks,
Matt