[druid-user] Extract data from a column with JSON string


I have a String column with JSON data. I want to extract individual key/value from this JSON string e.g. I would like to extract “type” from below JSON stored as String

{type=MBS, pool={MaturityDate=20211216, Key=US01F0124C76, Key1=IR.MBS.USD, Yield=100.30859375}}

I have already extracted/flattened the data while ingestion but I would like to see if I can extract with query.

Is there a way to do this ?


Hey Arpit;

Two places to go hunting:

First, the standard SQL expressions - there may be something here: SQL · Apache Druid

Then there’s Native query expressions in case you find anything so you can use a Native versus SQL query – I took a quick look but couldn’t see any like Xpath things specifically - but I may not have had my eyes open !!! :smiley: (I believe actually that SQL expressions are translated into Native expressions by Calcite so if you can’t find it here, there may not be something in SQL at all — but I have known to be wrong!)

Then I wondered – could you form a REGEX to pull it out? Or maybe some other kind of string manipulation?

There’s also the option of using Javascript functions – you can read more about that here:

This does sound like a good idea to develop the available query-time functions…

Thanks for the help.
I managed to extract a key/value using REGEXP scalar function. For example, to get type=MBS from the example I quoted above, I wrote below REGEX
REGEXP_EXTRACT(, ‘type=[a-zA-Z]+’)