Single-dimension partitioning not working

In the docs for the PartitionsSpec, under “single-dimension partitioning”, it says: “Each segment will contain all rows with values of that dimension in that range.”

Following that, I tried the partitionsSpec below:

"partitionsSpec": {
  "type": "dimension",
  "targetPartitionSize": 100000,
  "partitionDimension": "category"
}

But I don’t see partitions based on the category. The sample dataset I ingested has around 15 distinct categories spread across 3 years, so I expected one segment per category. Instead, it looks like segmentGranularity is taking precedence, as I can see only 3 segments being created.

Our requirement is that, after the initial ingest, our data is updated by category (which is one of the dimensions), so we may receive an update for a particular category. Our idea is to re-ingest the data for that category and target only a particular segment to avoid data loss.

Is there any other way to get a separate segment for each category?

Hi,
Single-dimension partitioning does not create a separate segment for each value; instead, it partitions data based on ranges of a single dimension. Each segment/shard gets a start and end value for that dimension. Creating a separate segment per value when partitioning on a high-cardinality dimension would be really inefficient, as it can potentially create tons of segments.
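
To make the range idea concrete, here is a rough Python sketch of how rows could be grouped into ranges of a dimension. This is only a simplified model and not Druid's actual implementation: the function name range_partitions and the greedy splitting rule are my own assumptions for illustration. It sorts the distinct values, keeps all rows of one value together, and starts a new range once adding the next value would exceed the target size.

```python
from collections import Counter


def range_partitions(values, target_size):
    """Group dimension values into contiguous ranges of about target_size rows.

    Hypothetical helper for illustration only. All rows sharing a value stay
    in the same range, so a range can exceed target_size when one value alone
    has more rows than the target.
    Returns a list of (start_value, end_value, row_count) tuples.
    """
    counts = Counter(values)
    partitions = []
    start, end, rows = None, None, 0
    for value in sorted(counts):
        # Close the current range if this value would push it past the target.
        if rows > 0 and rows + counts[value] > target_size:
            partitions.append((start, end, rows))
            start, rows = None, 0
        if start is None:
            start = value
        end = value
        rows += counts[value]
    if rows > 0:
        partitions.append((start, end, rows))
    return partitions


# Small example: four values, target of 5 rows per range.
sample = ["a"] * 3 + ["b"] * 2 + ["c"] * 4 + ["d"] * 1
print(range_partitions(sample, 5))
```

In this model, if the whole input has fewer rows than the target size, every value falls into a single range. As far as I know, partitioning is applied within each segmentGranularity interval, so a small sample dataset with a targetPartitionSize of 100000 would be consistent with ending up with one segment per year.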

I think your best bet here would be to reindex data for a range instead of a single value.