Reindex task + schemaless (dimensionless) ingestion = metrics lost

Hi team - it’s me again

We are running the Kafka Indexing Service for real-time ingestion without specifying dimensions (schemaless). According to the documentation, the fields our metrics are defined on are not stored as dimensions:

  • The timestamp column and columns/fieldNames required by metrics are excluded by default.

As we are accumulating lots of segments, we are trying to set up a reindex task using IngestSegmentFirehose.

The problem is that the current Druid data no longer contains the original input fields used to calculate the metrics (as noted above, they were not ingested as dimensions), and the reindex task requires the metrics to be specified (we get an NPE otherwise).

So, after the task finishes successfully, all our metrics are lost.

Can you confirm this scenario? How can we overcome it, please?
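In case it helps, here is a sketch of the kind of reindex spec we mean (the dataSource name, interval, and metric names are illustrative, not our real ones). Since the original input fields are gone, the only approach we can think of is pointing the aggregators at the stored metric columns themselves, assuming IngestSegmentFirehose exposes them under their metric names:

```json
{
  "type": "index",
  "spec": {
    "dataSchema": {
      "dataSource": "our_datasource",
      "metricsSpec": [
        { "type": "longSum", "name": "count", "fieldName": "count" },
        { "type": "doubleSum", "name": "value_sum", "fieldName": "value_sum" }
      ]
    },
    "ioConfig": {
      "type": "index",
      "firehose": {
        "type": "ingestSegment",
        "dataSource": "our_datasource",
        "interval": "2018-01-01/2018-02-01"
      }
    }
  }
}
```

Is referencing the metric columns like this (a "combining" style aggregation over already-rolled-up data) the intended way, or is something else expected?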



Any ideas about this, team? For now we are using a Hadoop job to reingest from S3, but we need to put a system in place to change the granularity of old data (to daily).
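For the granularity change, we would expect to set something like the following granularitySpec in the reindex task (the interval is illustrative):

```json
"granularitySpec": {
  "type": "uniform",
  "segmentGranularity": "DAY",
  "queryGranularity": "DAY",
  "intervals": ["2018-01-01/2018-02-01"]
}
```

If that is correct, the only blocker remains how to express the metrics without the original input fields.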
Thanks for your time,