I have a specific use case, and I am wondering if anyone else has dealt with a similar situation and could share some thoughts.
I am planning to use Druid for loading a variety of datasets whose schemas may be completely different and dynamic. So, imagine different types of timeseries data with different dimensions and metrics. The problem is: what happens when a dataset doesn't have a timestamp as one of its columns? If we use a fake, fixed timestamp value, will that cause significant performance degradation in Druid?
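To make the fake-timestamp idea concrete, here is a rough sketch of what I had in mind for the ingestion spec (names like `no_time_dataset` are just placeholders, and I'm assuming the `missingValue` field of `timestampSpec` can be used to supply a constant time for rows that have no timestamp column):

```json
{
  "dataSchema": {
    "dataSource": "no_time_dataset",
    "timestampSpec": {
      "column": "timestamp",
      "format": "auto",
      "missingValue": "2000-01-01T00:00:00Z"
    },
    "dimensionsSpec": {
      "dimensions": []
    }
  }
}
```

With a spec like this, every row would land in a single time chunk, which is exactly the scenario I'm worried about performance-wise.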
About 75% of my cases are covered by datasets that do have a timestamp column, but the rest may not, so I am wondering what the best way to deal with this is. Druid seems to provide several benefits, such as a column-oriented store, compression, and fast filtering using bitmap indexes, etc. But without time-based segmentation, will performance suffer? If so, by how much?
Any thoughts/comments would be greatly appreciated!