Thanks to the whole community for open sourcing Druid .
I was pretty amazed by the exciting features Druid offers and I am planning to shift to Druid for real time data analysis. So we are a small team working on integrating various advertisement platforms (Facebook, pinterest) with the analytic platforms (Google analytics, DAX) to help marketers generate some meaningful insights from the data and optimize their ad performances. Currently we are pulling data from Facebook and Google and storing it in a postgres DB. The frequency of above operation is like every 30 mins (but in future it may reach every 5 mins or so).
Since Data in Druid is saved in forms of DataSources rather than tables, I am defining a dummy datasource according to my use-case :
the fields of the datasource will be as following:
- So if I want to use multi - dimensional query like “What is the total spend and revenue for the period timestamp1 to timestamp2 where gender was X and city was Y and likes is Z”
What would be the latency with respect to time if simultaneously 100 queries are executed from the front-end and if we increase the dimensions(3 in above query) to around 10-15.
Can we query across different DataSources at once or in a single query ?
For now I want to import the data from my postgres Database to Druid. The batch ingestion tutorial required Hadoop for the same. I am not very comfortable in using Hadoop so is their an alternate way for batch ingestion or importing the data into Druid ?
Real time ingestion requires Apache Kafka to be integrated with Postgres , where postgres acts as a producer and then kafka as a consumer can pass the real time data added to druid real time ingestion nodes. But I’m facing problems integrating postgres with kafka since not many open source tools are available of which most are broken. I need help with that too ? Druid forum might not be the best place to ask how to integrate Kafka with Postgres but any alternative solution to this problem from an experienced community can be of great help .
Thanks a lot for the help . A lot more questions coming down the way after setting up Druid and integrating it with Postgres.