Dynamic schemas

Hi all,

Is there any way in Druid to add columns to the pre-aggregated data on an ad-hoc basis?

As I see it, for Druid to run an aggregating query, every pre-aggregation calculation must already be present in the indexed data source. So if you want to start running a new analytic query that depends on a pre-aggregation calculation (e.g. colA * colB), and that value is not already pre-calculated as, say, colC in the data loaded into the Indexing Service, you would have to add the calculated column to your source data and re-index it.

Is this the case, or is there a clever way around it?

Thanks,

Nick

Different segments can have different schemas. You can easily add new columns to new segments. You’ll need to batch index or run delta ingestion to add columns to old segments.
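
To make the batch re-index route concrete, here is a minimal sketch of the pre-computation step that would run on the source data before it is handed to the Indexing Service. It assumes the source is newline-delimited JSON with numeric fields colA and colB; the function name and the sample row are hypothetical and not part of Druid.

```python
import io
import json


def add_derived_column(src, dst, a="colA", b="colB", out="colC"):
    """Copy newline-delimited JSON rows from src to dst, adding out = a * b.

    Hypothetical helper: the field names are taken from the example in this
    thread, everything else is an assumption about the source data layout.
    """
    count = 0
    for line in src:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        # Compute the product per row, before Druid rolls the rows up.
        row[out] = row[a] * row[b]
        dst.write(json.dumps(row) + "\n")
        count += 1
    return count


# Example with in-memory data (hypothetical row).
source = io.StringIO('{"timestamp": "2016-04-01T00:00:00Z", "colA": 2, "colB": 3}\n')
target = io.StringIO()
add_derived_column(source, target)
```

The product has to be computed per row because the sum of colA * colB over many rows is generally not the same as the sum of colA multiplied by the sum of colB. The rewritten file would then be re-indexed with colC declared as a metric in the ingestion spec.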

For the specific example you mention, Druid does not currently support those operations, but I think Druid’s cache will address many of your concerns.
http://druid.io/docs/0.9.0-rc2/querying/caching.html
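
As a rough illustration of how a query can opt into caching, the sketch below builds a timeseries query whose context sets the useCache and populateCache flags described on that page. The data source name, interval, and the colC metric are hypothetical, and whether results are actually cached also depends on how caching is configured on the cluster.

```python
import json


def build_cached_timeseries_query(data_source, interval, metric="colC"):
    """Return a Druid timeseries query (as a dict) that sums one metric.

    Hypothetical helper: the data source and metric names are placeholders.
    """
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": "day",
        "intervals": [interval],
        "aggregations": [
            # Sum the pre-computed metric over each day.
            {"type": "doubleSum", "name": "sum_" + metric, "fieldName": metric}
        ],
        # Ask Druid to read from and write to its query cache.
        "context": {"useCache": True, "populateCache": True},
    }


query = build_cached_timeseries_query("my_datasource", "2016-04-01/2016-04-08")
print(json.dumps(query, indent=2))
```

The printed JSON is what would be sent as the query body; repeated runs of the same query over the same segments are the case the cache is meant to help with.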

Thanks for the response, Fangjin. Pretty much what I thought. Will look into delta ingestion.