I know Splunk supports this, and Imply claims to support it as well.
Does default open-source Druid support it?
I have a field (dimension) XYZ whose content is a large JSON string with nested fields.
Can we run a query that extracts some nested fields from XYZ at query time and aggregates on them?
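To make the request concrete, here is a rough sketch of the operation, written in Python and run client-side over a few rows. It is not a Druid query. The nested field names (`user.country`, `bytes`) and the helper `extract` are made up for illustration; only the dimension name XYZ comes from the actual data.

```python
import json
from collections import defaultdict


def extract(raw, path):
    """Follow a dotted path through a JSON string; return None if any key is missing."""
    node = json.loads(raw)
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Hypothetical rows: XYZ holds the raw nested JSON as a plain string.
rows = [
    {"XYZ": '{"user": {"country": "US"}, "bytes": 120}'},
    {"XYZ": '{"user": {"country": "DE"}, "bytes": 80}'},
    {"XYZ": '{"user": {"country": "US"}, "bytes": 30}'},
]

# Group by one extracted field and sum another, skipping rows that lack either.
totals = defaultdict(int)
for row in rows:
    country = extract(row["XYZ"], "user.country")
    size = extract(row["XYZ"], "bytes")
    if country is not None and size is not None:
        totals[country] += size

print(dict(totals))  # {'US': 150, 'DE': 80}
```

The question is whether Druid can do the equivalent of `extract` plus the group-by at query time, without those fields having been declared at ingestion.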
For my own clarification: the nested JSON data has already been ingested as a string dimension, and you’d like to extract some of its fields and then aggregate on them?
Mark, yes, that’s the use case.
Most of our queries are made on the extracted and transformed dimensions and metrics.
However, we want to keep one field (the big raw JSON string with its nested structure) in case we later want to run ad-hoc queries on fields that we did not consider relevant at ingestion time.
This might have been the Imply (my employer) reference you made in your original post, regarding nested columns. If I’m wrong, please feel free to correct me. I am personally hopeful that this feature will make its way into open-source Druid. I’m sure that your use case is not unique.
Unfortunately, as of this date, I don’t see a way to replicate the functionality without re-ingesting, flattening, and potentially transforming.
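For what it's worth, the flattening step could be done in a small pre-processing script before re-ingestion. Below is a minimal sketch in Python, assuming each record is a dict whose XYZ value is a JSON object string. The function names (`flatten`, `flatten_row`) and the sample fields are hypothetical, and arrays are left as-is rather than expanded.

```python
import json


def flatten(obj, prefix=""):
    """Turn nested dicts into one flat dict with dotted keys; non-dict values are kept as-is."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def flatten_row(row, json_field="XYZ"):
    """Replace the raw JSON string column with one flat column per nested leaf."""
    out = {k: v for k, v in row.items() if k != json_field}
    out.update(flatten(json.loads(row[json_field]), json_field))
    return out


# Hypothetical record as it might look before re-ingestion.
row = {
    "timestamp": "2022-01-01T00:00:00Z",
    "XYZ": '{"user": {"country": "US"}, "bytes": 120}',
}
flat_row = flatten_row(row)
print(flat_row)
```

Each dotted key (for example `XYZ.user.country`) could then be declared as an ordinary dimension or metric in the new ingestion. The downside is the one you already named: the fields have to be chosen before ingestion.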
Thank you, Mark. So Imply does not have this feature either?
The Imply distribution supports nested JSON and can store JSON natively. I am not sure whether there is a plan to bring that feature to open-source Druid.