We are migrating Druid to a new cluster and also changing the underlying storage from HDFS to S3. We have around a month of data in the old cluster and don’t want to lose that. What is the best way to move all the data and update the metadata storage accordingly?
It should be enough to copy the segment files from HDFS to S3 and then copy the corresponding rows from the old metadata store’s segment table to the new one. You’ll also have to edit the “loadSpec” in each copied row’s payload so it points at the new S3 location. Running a small test S3 indexing job on the new cluster will show you exactly what the new loadSpecs should look like.
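As a rough illustration of the loadSpec rewrite, here is a Python sketch. The bucket name, the HDFS path layout, and the choice to keep the same relative key are all assumptions for the example; confirm the exact loadSpec shape (including the `"type"` value) against a payload produced by your test S3 indexing run before running anything like this against the real metadata table.

```python
import json

def hdfs_to_s3_load_spec(load_spec: dict, bucket: str) -> dict:
    """Rewrite an HDFS-style loadSpec into an S3-style one.

    Illustrative only: assumes the HDFS paths look like
    hdfs://<namenode>/druid/segments/<...>/index.zip and that the
    segment zips were copied to the same relative key in S3.
    """
    path = load_spec["path"]
    # Assumes this path layout; adjust the split for your deployment.
    key = "druid/segments/" + path.split("/druid/segments/", 1)[1]
    return {"type": "s3_zip", "bucket": bucket, "key": key}

def rewrite_payload(payload_json: str, bucket: str) -> str:
    """Rewrite the loadSpec inside one segment-table payload (a JSON blob)."""
    payload = json.loads(payload_json)
    payload["loadSpec"] = hdfs_to_s3_load_spec(payload["loadSpec"], bucket)
    return json.dumps(payload)
```

You would apply `rewrite_payload` to each row copied into the new metadata store’s segment table, then let the new cluster’s coordinator load the segments from S3.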