High-dimensionality spatial index

Hey guys!

I’m checking out Druid, and it looks fantastic. Does anyone have experience implementing a spatial index with around 200-300 dimensions? A k-d tree is not efficient at such high dimensionality, and I’m hoping someone with experience or ideas can tell me whether Druid will handle this type of query well.

Ideally, I’d like to say: “give me the top N documents within an X radius of some point”, where that radius defines a d-dimensional sphere (d ~ 200).

Look forward to any advice!

Cheers,

Adrian :slight_smile:

Hi Adrian, the spatial indexes are experimental in Druid. I know of some deployments that use them, but I am not sure at what scale. The spatial indexes use an R-tree, not a k-d tree. It would be interesting if you guys ran some experiments to measure the performance of the indexes. The spatial indexes should definitely get some love in the near future.

Also note that spatial dimensions are Euclidean, so adjust your queries accordingly.
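
For anyone running experiments like this, a brute-force scan can serve as a correctness baseline to compare index results against. The sketch below is plain Python, not Druid code. The function name `top_n_within_radius` is hypothetical, and it assumes "top N" means the N closest points by Euclidean distance, which the original question does not specify.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def top_n_within_radius(
    points: Sequence[Sequence[float]],
    center: Sequence[float],
    radius: float,
    n: int,
) -> List[Tuple[float, int]]:
    """Return up to n (distance, index) pairs for points inside the sphere.

    Linear scan over every point: O(len(points) * dimensions).
    Ranking by distance is an assumption made for illustration.
    """
    hits = []
    for idx, point in enumerate(points):
        # Euclidean distance, valid for any number of dimensions.
        dist = math.dist(point, center)
        if dist <= radius:
            hits.append((dist, idx))
    # Keep the n closest hits, sorted by ascending distance.
    return heapq.nsmallest(n, hits)


if __name__ == "__main__":
    dims = 200
    origin = [0.0] * dims
    docs = [[0.0] * dims, [0.1] * dims, [1.0] * dims]
    for dist, idx in top_n_within_radius(docs, origin, radius=2.0, n=2):
        print(idx, round(dist, 3))
```

A linear scan like this is slow on large data sets, but its results are exact, so any point an index returns or misses can be checked against it.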

Awesome! I’m going to run a few tests over the next few days… I’ll let you know how I go! :slight_smile: