I’ve read about strategies for ordering dimensions in the ingestion spec, but I have never come across any mention of whether the order in which the metrics are specified has any consequences.
We have a datasource with 35 dimensions and 30 metrics, hourly granularity (both segment and query), and about 6 segment partitions (hashed scheme) per hour, totaling about 3.5 GB per hour on the historicals. When I grab one of those segment files from S3 and extract it manually, I see only a single smoosh file, not several of them.
I wonder why there aren’t more smoosh files and whether it would make any difference if I grouped metrics in a certain order to make them adjacent in the smoosh files.
Let’s assume that the metrics form logical categories, such as a bunch of metrics forming a customer funnel, another bunch relating to user statistics, etc. Or let’s say that some metrics belong together because they are used in the same mathematical expression, like an average that needs to aggregate over two metrics. Would it have a positive effect if those metrics were arranged close to each other within the smoosh files?
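In case it helps to make the question concrete, here is a minimal sketch of how I would check which columns actually end up next to each other inside a smoosh file. It assumes the `meta.smoosh` file from the extracted segment is plain text with a header line of the form `v1,<maxChunkSize>,<numFiles>` followed by one `name,fileNum,startOffset,endOffset` line per internal file; that is my understanding of the layout, not something I have verified against the Druid source. The metric names and offsets in the sample are made up for illustration.

```python
from typing import List, NamedTuple


class SmooshEntry(NamedTuple):
    name: str       # internal file name, usually a column name
    file_num: int   # which smoosh file (e.g. 0 for 00000.smoosh) holds it
    start: int      # start byte offset within that smoosh file
    end: int        # end byte offset within that smoosh file


def parse_meta_smoosh(text: str) -> List[SmooshEntry]:
    """Parse meta.smoosh text and return entries in physical on-disk order.

    Assumed layout: header 'v1,<maxChunkSize>,<numFiles>', then one
    'name,fileNum,startOffset,endOffset' line per internal file.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("v1,"):
        raise ValueError("unexpected meta.smoosh header")
    entries = []
    for ln in lines[1:]:
        # Split from the right so a comma in a column name does not break parsing.
        name, file_num, start, end = ln.rsplit(",", 3)
        entries.append(SmooshEntry(name, int(file_num), int(start), int(end)))
    # Order by smoosh file first, then by byte offset within that file.
    return sorted(entries, key=lambda e: (e.file_num, e.start))


# Hypothetical sample; real content would come from the segment's meta.smoosh.
SAMPLE = """v1,2147483647,1
__time,0,0,1200
funnel_checkouts,0,5200,7400
funnel_views,0,1200,3400
index.drd,0,7400,7600
user_sessions,0,3400,5200
"""

layout = parse_meta_smoosh(SAMPLE)
for entry in layout:
    print(f"{entry.file_num:05d}.smoosh  {entry.start:>8}  {entry.name}")
```

With the real `meta.smoosh` from one of our segments, this would show whether the two funnel metrics (hypothetical names above) are adjacent on disk or separated by unrelated columns, which is the arrangement I am asking about.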