What is the maximum limit on dimension count and cardinality?

We have started using Druid for our project, and the version we currently use is 0.7.1.1. To decide whether we can support a high-cardinality dimension, and for other reasons, I have the following questions:

  1. What is the maximum number of dimensions we can have? Has any benchmarking been done, and is there a recommended count?
  2. What is the maximum cardinality a dimension can have? Or, what trade-offs do we have to accept if we go beyond a certain limit (and, of course, what is that limit)?

Can you please answer these questions?

//Sithik

Hi Sithik,
See inline.

Also, there have been many improvements and bug fixes in Druid since 0.7.1.1.

I would also recommend that you switch to 0.8.1.

Thanks, Nishant, for the quick reply.

From another thread I already learned that people have explored up to 100 dimensions, but we have a requirement to support 150+ dimensions,
and that’s why I wanted to check the cap on the number of dimensions we can have.
2. If I understand correctly, the HyperUnique aggregator / HyperLogLog sketches only give us the cardinality of that dimension and nothing else. Suppose I want to get a metric count while filtering on both this high-cardinality dimension and a low-cardinality dimension; I don’t think that is possible if the high-cardinality dimension is kept only as HyperLogLog sketches? Moreover, we might need
to bear the cost of HyperUnique aggregation during ingestion; if so,
are any stats available on this?

Sure, we will migrate to 0.8.1.

Thanks,
Sithik

Inline.

Thanks, Nishant, for the quick reply.

From another thread I already learned that people have explored up to 100 dimensions, but we have a requirement to support 150+ dimensions,
and that’s why I wanted to check the cap on the number of dimensions we can have.

Folks have been successful with thousands of dimensions. 150+ is fine.

  2. If I understand correctly, the HyperUnique aggregator / HyperLogLog sketches only give us the cardinality of that dimension and nothing else. Suppose I want to get a metric count while filtering on both this high-cardinality dimension and a low-cardinality dimension; I don’t think that is possible if the high-cardinality dimension is kept only as HyperLogLog sketches? Moreover, we might need
     to bear the cost of HyperUnique aggregation during ingestion; if so,
     are any stats available on this?

Watch this talk:

Thanks Fangjin Yang. Let me go through the video.
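
For anyone landing on this thread with the same HyperUnique question, here is a rough sketch of the two options discussed above, written as Python dictionaries that serialize to Druid JSON. It is only an illustration under assumptions: the column names `user_id` (high cardinality) and `country` (low cardinality), the datasource name `events`, and the helper function names are all made up for this example. The idea is that a `hyperUnique` metric only answers "how many distinct values", so filtering on individual values still requires keeping the column as a regular dimension; nothing stops you from doing both.

```python
import json

# Hypothetical schema: "user_id" is the high-cardinality column,
# "country" is a low-cardinality dimension. Neither name comes from the thread.


def build_metrics_spec():
    """Ingestion-time metrics: a row count plus an HLL sketch of user_id."""
    return [
        {"type": "count", "name": "count"},
        # Stores a HyperLogLog sketch per rolled-up row; it can only be
        # merged and counted, not filtered by individual user_id values.
        {"type": "hyperUnique", "name": "user_unique", "fieldName": "user_id"},
    ]


def build_filtered_count_query(data_source, interval, user_id, country):
    """Event count filtered on both columns.

    This only works if user_id was ALSO ingested as a regular dimension.
    """
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": "all",
        "intervals": [interval],
        "filter": {
            "type": "and",
            "fields": [
                {"type": "selector", "dimension": "user_id", "value": user_id},
                {"type": "selector", "dimension": "country", "value": country},
            ],
        },
        # Sum the ingestion-time "count" metric to get the number of raw events.
        "aggregations": [
            {"type": "longSum", "name": "events", "fieldName": "count"},
        ],
    }


def build_unique_users_query(data_source, interval, country):
    """Approximate distinct users, filtered only on the low-cardinality dimension."""
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": "all",
        "intervals": [interval],
        "filter": {"type": "selector", "dimension": "country", "value": country},
        "aggregations": [
            {"type": "hyperUnique", "name": "unique_users", "fieldName": "user_unique"},
        ],
    }


if __name__ == "__main__":
    query = build_filtered_count_query("events", "2015-09-01/2015-09-02", "u123", "US")
    print(json.dumps(query, indent=2))
```

The printed JSON could be POSTed to a broker as a normal timeseries query. Whether keeping `user_id` as a dimension is affordable depends on your data volume and segment sizes, so treat this as a starting point for your own benchmarking rather than a recommendation.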