Has anyone here experimented with running graph queries on Druid? We are building a Google Analytics-like system that reports metrics such as how many users went from page A to page B. We already use Druid for parts of our system and would like to see whether we can leverage it for this use case as well.

I am thinking of representing the graph as an adjacency list. The vertices (web pages) are fairly static, so I would put them in a separate datasource. The edges (visits from one page to another) change constantly, so I would keep them in a second datasource whose rows hold the ids of the source and destination pages.
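To make the layout concrete, here is a rough sketch of the two row shapes and the kind of aggregation we would want over the edges. It is plain Python rather than Druid, and the field names (`page_id`, `user_id`, `src_page_id`, `dst_page_id`) and sample rows are made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """One row of the (mostly static) vertex datasource. Hypothetical fields."""
    page_id: int
    url: str


@dataclass(frozen=True)
class Edge:
    """One row of the edge datasource: a single visit from one page to another."""
    timestamp: str
    user_id: str
    src_page_id: int
    dst_page_id: int


# Illustrative sample data, not real traffic.
pages = {1: Page(1, "/home"), 2: Page(2, "/pricing"), 3: Page(3, "/signup")}

edges = [
    Edge("2024-01-01T00:00:01Z", "u1", 1, 2),
    Edge("2024-01-01T00:00:05Z", "u2", 1, 2),
    Edge("2024-01-01T00:00:09Z", "u1", 2, 3),
    Edge("2024-01-01T00:00:12Z", "u1", 1, 2),
]


def transition_counts(rows):
    """Count visits per (source, destination) pair, like a groupBy on both ids."""
    return Counter((e.src_page_id, e.dst_page_id) for e in rows)


def distinct_users(rows, src, dst):
    """Count distinct users who made the src -> dst transition at least once."""
    return len({e.user_id for e in rows
                if e.src_page_id == src and e.dst_page_id == dst})


counts = transition_counts(edges)
print(counts[(1, 2)], distinct_users(edges, 1, 2))
```

The "how many users went from A to B" metric corresponds to `distinct_users`, while `transition_counts` is the raw edge count per pair.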

We expect close to 2 billion edges per day, and I want to see whether Druid can scale to that write volume while keeping reads fast.
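For a sense of scale, the daily volume can be turned into a per-second rate with some quick arithmetic. The peak multiplier below is purely an assumption for illustration; real traffic may be more or less bursty.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

edges_per_day = 2_000_000_000  # figure from the post
avg_edges_per_sec = edges_per_day / SECONDS_PER_DAY

# Assumption: peak traffic is about 3x the daily average.
assumed_peak_factor = 3
peak_edges_per_sec = avg_edges_per_sec * assumed_peak_factor

print(f"average: {avg_edges_per_sec:,.0f} edges/sec")
print(f"assumed peak: {peak_edges_per_sec:,.0f} edges/sec")
```

So the sustained ingest rate is roughly 23 thousand edges per second on average, before allowing for peaks.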

If you have any ideas or suggestions, please pass them on. I want to make sure I experiment with Druid before moving to specialised graph DBs.

Setups like this are fairly common. It sounds like what you are really after is funnel analysis, and for that you should take a look at theta sketches.
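As a rough sketch of what that could look like, the snippet below builds a native Druid timeseries query that estimates how many users hit both page A and page B by intersecting two theta sketches. It assumes the druid-datasketches extension is loaded; the datasource name `page_visits`, the dimension names `page` and `user_id`, and the interval are hypothetical. Note that an intersection only says a user visited both pages in the interval, not in which order.

```python
import json


def page_sketch(name, page_value):
    """Theta sketch of user ids, restricted to rows for a single page."""
    return {
        "type": "filtered",
        "filter": {"type": "selector", "dimension": "page", "value": page_value},
        "aggregator": {"type": "thetaSketch", "name": name, "fieldName": "user_id"},
    }


def funnel_query(datasource, interval, page_a, page_b):
    """Build a query estimating the number of users who visited both pages."""
    return {
        "queryType": "timeseries",
        "dataSource": datasource,  # hypothetical datasource name
        "granularity": "all",
        "intervals": [interval],
        "aggregations": [
            page_sketch("a_users", page_a),
            page_sketch("b_users", page_b),
        ],
        "postAggregations": [
            {
                "type": "thetaSketchEstimate",
                "name": "a_and_b_users",
                "field": {
                    "type": "thetaSketchSetOp",
                    "name": "a_and_b_sketch",
                    "func": "INTERSECT",
                    "fields": [
                        {"type": "fieldAccess", "fieldName": "a_users"},
                        {"type": "fieldAccess", "fieldName": "b_users"},
                    ],
                },
            }
        ],
    }


query = funnel_query("page_visits", "2024-01-01/2024-01-02", "A", "B")
print(json.dumps(query, indent=2))
```

The printed JSON is what you would POST to the Broker's native query endpoint. Longer funnels follow the same pattern with one filtered sketch per step.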