Thanks Vadim for the reply. Please see my replies inline.
I’ll try to answer your questions in-line
- Ability to run reports on realtime data(possible with druid)
This is a very common use case of Druid. You will want to look into http://druid.io/docs/latest/ingestion/stream-ingestion.html
- Support for multi tenancy(druid supports multitenancy)
Druid is built from the ground up to excel at this.
- Dynamically added more dimensions to an existing row. As segments are immutable is it possible to do .?
This is not possible to do directly. One of the reasons that Druid is so fast is because of rollup (http://druid.io/docs/latest/design/index.html#roll-up) as such the original concept of a row is lost when segments are created. You will need to recreate the segment (re-index the data for that timeframe) or define your dimension in a derived way. It is common that people set up query time lookups (http://druid.io/docs/latest/querying/lookups.html) to dynamically map between data ids and variable data labels. Could you shed more light on your use-case?
Lets take an example. We are fetching transactions by customer dimension.After few days we introduced a new dimension called location and need to query transactions by location. As per your answer we need to recreate segment everytime new introduce a new dimension.
- Each row in DB may not have same dimensions .Dimensions vary by transaction. As we introduce new transactions , new dimensions and measures are created .(Should I use schemaless dimensions ?)
Sounds like you need to setup Schemaless dimension ingestion.
- Can we add new measures / edit measures on an existing row.
Same as the answer to (3) you can not create a new column without reindexing the data but you can create very elaborate derived measures.
For example if you had a dimension “country” and a measure “revenue” you could add derived measures like SUM(
country = “USA”) [in PlyQL syntax]
- Each dimension should have key and value. Key will never change and value will change. How to define dimension spec to have key and value
Can you elaborate on this? Do you mean each row?
For example lets take customer dimension. Each customer will have Id and name(id:1,name:john). Name of the customer may change.But id remains same. How to save both
id and name of dimension in the segment ?