How do we determine the number of shards for hash-based indexing?

3:41 PM (5 minutes ago)

All,

I am new to Druid. We have completed data ingestion and are able to query the data. We are now trying to create a hash index on a few dimensions, such as SSN and DOB, and have the following questions:

  1. How do I create different buckets for different dimensions?

  2. How do I determine the number of shards required?

  3. Is there any relation between the number of shards and the partition dimensions?

Thanks,

Ravali.

Answers inline.

  1. How do I create different buckets for different dimensions?

What do you mean by different buckets?

  2. How do I determine the number of shards required?

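Generally, try to keep each segment a few hundred megabytes in size. Pick the number of shards so that the data in one segment interval, divided across the shards, comes out at roughly that size per segment.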
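As a rough sketch of that sizing rule (not anything Druid ships): the helper name `estimate_num_shards` and the 500 MiB default target are assumptions picked for illustration, and `interval_bytes` stands for the segment data you expect in one time interval.

```python
import math


def estimate_num_shards(interval_bytes: int,
                        target_segment_bytes: int = 500 * 1024**2) -> int:
    """Estimate how many shards keep each segment near the target size.

    Both the function and the 500 MiB default are illustrative
    assumptions, not Druid settings.
    """
    if interval_bytes <= 0:
        # No data (or unknown size): a single shard is the minimum.
        return 1
    # Round up so that no shard exceeds the target size.
    return max(1, math.ceil(interval_bytes / target_segment_bytes))


# Example: about 3 GiB of segment data per interval.
print(estimate_num_shards(3 * 1024**3))  # 7
```

The on-disk segment size depends on column types and compression, so it is worth checking the resulting segments after a first ingestion run and adjusting the shard count from there.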

  3. Is there any relation between the number of shards and the partition dimensions?

If you are using single-dimension hashed partitioning, rows with the same value for that dimension should end up in the same shard.
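To illustrate the idea, here is a minimal sketch of hash partitioning in general. It is not Druid's implementation: the function `shard_for`, the SHA-256 hash, and the sample rows are all assumptions chosen so the example is deterministic and self-contained.

```python
import hashlib


def shard_for(row: dict, partition_dimensions: list, num_shards: int) -> int:
    """Map a row to a shard by hashing only its partition dimensions.

    Illustrative only: Druid uses its own hash function and key encoding.
    """
    # Build a key from the partition dimensions; other columns are ignored.
    key = "\x1f".join(str(row.get(dim)) for dim in partition_dimensions)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Reduce the hash to a shard number in the range [0, num_shards).
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical rows: same SSN, different values in the other columns.
row_a = {"SSN": "123-45-6789", "DOB": "1980-01-01", "city": "Austin"}
row_b = {"SSN": "123-45-6789", "DOB": "1980-01-01", "city": "Boston"}

# Both rows hash to the same shard because only SSN feeds the hash.
print(shard_for(row_a, ["SSN"], 4) == shard_for(row_b, ["SSN"], 4))  # True
```

In this sketch the partition dimensions decide which rows are grouped together, while the number of shards only decides how many groups the hash space is split into. Changing the shard count moves values between shards but never splits one value across two shards.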