We would like to migrate metadata and data from a Druid 0.15.1 cluster to a new cluster running Druid 0.22.1. Do you foresee any compatibility issues, e.g. due to differences in data/metadata formats?
Great question. I like to think about migration compatibility in three areas: ingestion, queries, and segment (meta)data.
When preparing and testing a migration like this, I'd compare the release notes from the current version (0.15.1) all the way through to the target version. I check these both for compatibility and for new features (vectorization, etc.) that might improve performance or reduce pain and cost.
On the query side, the major breaking change was the removal of the Select query type in favor of Scan in 0.17. You'd have to rewrite any of those, but you should test your other important queries as well.
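For reference, a former Select query can usually be re-expressed as a Scan query along these lines (the datasource, interval, and column names here are just placeholders for illustration):

```json
{
  "queryType": "scan",
  "dataSource": "my_datasource",
  "intervals": ["2021-01-01/2021-02-01"],
  "columns": ["__time", "page", "user"],
  "resultFormat": "list",
  "limit": 100
}
```

Scan streams results instead of materializing them like Select did, so it's also much friendlier to large result sets.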
Ingest has pretty good backward compatibility but I’ve found in many cases it’s worth revisiting ingest specs to take advantage of new features like inputFormat, or rewriting old spec so it comes up nicely in the dataloader UI (you can always submit json supervisors if the UI complains).
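As a sketch of what I mean by revisiting specs: since 0.17 the `ioConfig` can use the newer `inputSource` + `inputFormat` pair instead of the old firehose/parseSpec style. The paths and filter below are made-up examples, not anything from your setup:

```json
"ioConfig": {
  "type": "index_parallel",
  "inputSource": {
    "type": "local",
    "baseDir": "/data/events",
    "filter": "*.json"
  },
  "inputFormat": {
    "type": "json"
  }
}
```

Specs written this way also round-trip cleanly through the data loader UI, which the old-style specs often don't.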
Finally, on the segment metadata side, the major change came in 0.15.0, when Druid stopped writing descriptor.json files to deep storage. Before that, segment information lived in three places: 1) the metadata DB, 2) the segment files themselves, and 3) the descriptor.json files in deep storage; now it's just the metadata DB plus the segments in deep storage.
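If you want to sanity-check what the new cluster would pick up, the metadata store's segments table is the source of truth. Something like this (assuming the default `druid_` metadata table prefix; adjust if yours differs) shows what's marked as used per datasource:

```sql
-- Count used segments per datasource in the Druid metadata store
-- (assumes the default "druid_" table prefix)
SELECT dataSource, COUNT(*) AS num_segments
FROM druid_segments
WHERE used = true
GROUP BY dataSource;
```

Comparing these counts between the old and new clusters after the backfill is a quick way to confirm nothing was dropped.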
This guide from Imply (disclosure, my employer) has a few paths you might explore:
My approach: spin up the new cluster, ingest some fresh data, test your queries, and backfill the historical data. Once everything checks out, flip the load balancer so your users barely notice the cutover except for better performance. Alternatively, you could plan a simpler migration with downtime.
@pts I agree with the blue/green deployment approach. This is a very popular method of upgrading Druid.
Thank you so much for the reply.