Error when updating Timezone

Hello,

In order to have daily data aligned with a specific TZ, we've updated the granularitySpec (moved from UTC to Indian/Reunion):

    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": {
        "type": "period",
        "period": "P1D",
        "timeZone": "Indian/Reunion",
        "origin": null
      },
      "queryGranularity": {
        "type": "period",
        "period": "P1D",
        "timeZone": "Indian/Reunion",
        "origin": null
      },
      "rollup": true,
      "intervals": []
    },

but we get errors during ingestion like:
org.apache.druid.indexing.common.actions.SegmentAllocateAction - The interval of existing segment[ms_attach_day_2022-08-18T00:00:00.000Z_2022-08-19T00:00:00.000Z_2022-08-18T16:26:30.708Z] doesn't contain rowInterval[2022-08-17T20:00:00.000Z/2022-08-18T20:00:00.000Z]

I understand this happens because it tries to append rows to existing segments that still have the old UTC day boundaries (Indian/Reunion is UTC+4, so the new local day is 2022-08-17T20:00Z/2022-08-18T20:00Z, which doesn't fit inside the existing UTC segment 2022-08-18T00:00Z/2022-08-19T00:00Z)… but how can we solve that and allow new segments to be created?

Thanks

Can you please share your ioConfig? I’m wondering if this has something to do with appendToExisting?
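For reference, appendToExisting lives in the native batch ioConfig. A minimal sketch of what one looks like (the input source path and filter here are placeholders, not taken from your setup):

    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "local",
        "baseDir": "/path/to/data",
        "filter": "*.json"
      },
      "inputFormat": { "type": "json" },
      "appendToExisting": true
    }

When appendToExisting is true, the task allocates into existing segment intervals instead of creating a fresh version, which is why I'm curious whether it's involved here.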

Hi, thank you for your answer. Does appendToExisting make sense for Kafka streaming ingestion (which is what we use)? The documentation I read about it only covered batch ingestion.

I haven’t seen anyone do this before, but it would seem to create a conflict between the existing time intervals (segment IDs) and the new ones.

Just an idea: perhaps you can reingest the existing datasource into a new one, transforming the timestamps of the old data into the correct timezone, and then continue your real-time ingestion with the new timezone.

Hello Sergio,

Thanks for your answer. What is the fastest way to do that? It represents a huge amount of data and I don’t want to use too many resources for the reingestion…

Hey @OliveBZH, sorry for the late reply. A parallel index job would likely be the fastest. If you are concerned about resource consumption, you can do this one time interval at a time, perhaps a month at a time. You can also control the number of tasks used for the ingestion job to limit how many worker slots are used.
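To sketch the idea (this is abbreviated and not a complete spec — the new datasource name, the month interval, the timestampSpec/dimensionsSpec, and the subtask count are placeholders you'd adapt; only `ms_attach_day` comes from your error message), a parallel reindex of one month from the old datasource into a new one with the new timezone could look roughly like:

    {
      "type": "index_parallel",
      "spec": {
        "ioConfig": {
          "type": "index_parallel",
          "inputSource": {
            "type": "druid",
            "dataSource": "ms_attach_day",
            "interval": "2022-07-01T00:00:00.000Z/2022-08-01T00:00:00.000Z"
          }
        },
        "dataSchema": {
          "dataSource": "ms_attach_day_new",
          "timestampSpec": { "column": "__time", "format": "millis" },
          "granularitySpec": {
            "type": "uniform",
            "segmentGranularity": {
              "type": "period",
              "period": "P1D",
              "timeZone": "Indian/Reunion"
            },
            "queryGranularity": {
              "type": "period",
              "period": "P1D",
              "timeZone": "Indian/Reunion"
            },
            "rollup": true
          }
        },
        "tuningConfig": {
          "type": "index_parallel",
          "maxNumConcurrentSubTasks": 2
        }
      }
    }

You'd submit one of these per month (adjusting the interval each time), and raise or lower maxNumConcurrentSubTasks depending on how many worker slots you're willing to spend.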

Hope this helps.