Assuming that I have a dataSchema with a “cardinality” aggregator in metrics, what data will be stored in the segment when I set byRow=true? I have read about the difference between byRow=true and byRow=false in terms of query results, but there is no information about how it works at index time.
The “cardinality” aggregator is meant to be used at query time only. You would use “hyperUnique” at ingestion time.
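To make the split concrete, here is a rough sketch of where each aggregator would go, written as Python dicts that serialize to the JSON Druid expects. The data source name, column names, output names, and interval are all made up for illustration, and the multi-field property is called "fields" here, whereas older releases called it "fieldNames", so check the docs for your version.

```python
import json

# All names below ("events", "user_id", "country", ...) are hypothetical.

# Ingestion time: a hyperUnique metric, placed in the metrics list of the dataSchema.
ingestion_metric = {
    "type": "hyperUnique",
    "name": "unique_users",
    "fieldName": "user_id",  # hyperUnique takes a single input field
}

# Query time: a cardinality aggregator, placed in the "aggregations" list of a query.
query = {
    "queryType": "timeseries",
    "dataSource": "events",
    "granularity": "all",
    "intervals": ["2015-01-01/2015-02-01"],
    "aggregations": [
        {
            "type": "cardinality",
            "name": "distinct_user_country",
            "fields": ["user_id", "country"],  # "fieldNames" in older releases
            "byRow": True,
        }
    ],
}

print(json.dumps(query, indent=2))
```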
As you can see, I need “byRow” cardinality to count distinct values across several fields, not just one. There is no way to use hyperUnique for that case. And evidently a “cardinality” metric can be added to the dataSchema.
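For anyone unsure what the two byRow modes compute, here is a small sketch of the query-time semantics using exact Python sets. It is only an illustration: Druid estimates these counts with HyperLogLog rather than exact sets, and the function, field names, and rows are invented for the example.

```python
def cardinality(rows, fields, by_row=False):
    """Exact stand-in for the two modes of the cardinality aggregator."""
    if by_row:
        # byRow=true: count distinct combinations of the field values per row.
        return len({tuple(row[f] for f in fields) for row in rows})
    # byRow=false: count distinct values in the union of all the fields.
    return len({row[f] for row in rows for f in fields})


# Hypothetical input rows with two fields.
rows = [
    {"first": "a", "second": "x"},
    {"first": "a", "second": "y"},
    {"first": "b", "second": "x"},
    {"first": "a", "second": "x"},  # repeats the first combination
]

by_row_count = cardinality(rows, ["first", "second"], by_row=True)
union_count = cardinality(rows, ["first", "second"], by_row=False)
print(by_row_count, union_count)  # 3 4
```

With byRow=true the distinct combinations are (a, x), (a, y), and (b, x); with byRow=false the union of values is {a, b, x, y}.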
Have you looked into using DataSketches for complex set operations with approximate count-distinct queries?