We’ve been successfully using Druid for some months and now we want to extend its capabilities.
We currently have an event datasource that is very similar to the sample one that ships with the Druid installation:
timestamp publisher advertiser gender country click price
The difference is that every row also has a unique user identifier.
What we want to do is aggregate the data for every unique user and present that information in another table, which we like to call the “User Table”.
The main goal is to have 1 row = 1 user, with many columns that describe that user. Some of those columns need to be multi-value columns.
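To make the target shape concrete, here is a minimal sketch in plain Python (not Druid API) of the kind of roll-up we have in mind. The column name `user_id`, the helper `build_user_table`, and the choice of output columns are assumptions for illustration only; the other field names come from the schema above.

```python
from collections import defaultdict


def build_user_table(events):
    """Collapse event rows into one row per user (hypothetical helper).

    Each event is a dict with the event columns plus an assumed
    `user_id` key. Numeric columns are summed; the distinct publishers
    and countries seen for a user become multi-value columns.
    """
    acc = defaultdict(
        lambda: {"clicks": 0, "total_price": 0.0,
                 "publishers": set(), "countries": set()}
    )
    for event in events:
        row = acc[event["user_id"]]
        row["clicks"] += event["click"]
        row["total_price"] += event["price"]
        row["publishers"].add(event["publisher"])
        row["countries"].add(event["country"])

    # Turn the sets into sorted lists so every user row has a stable,
    # multi-value representation.
    return {
        user_id: {
            "clicks": row["clicks"],
            "total_price": row["total_price"],
            "publishers": sorted(row["publishers"]),
            "countries": sorted(row["countries"]),
        }
        for user_id, row in acc.items()
    }
```

The open question is how to keep such a table up to date inside Druid as new events arrive, rather than recomputing it from scratch outside of it.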
Why do we need that? Because there are some queries that we are not able to execute with the current event-oriented format, so we need to reshape the data into a user table.
How can we achieve that? We want to keep using Druid's outstanding index performance, but the problem is that its data format is timestamp-oriented. We would like to update user rows in place, much like an “update” statement in SQL databases.
Any feedback or comment would be helpful, thanks a lot for your time!