Is there a way to implement listFiltered dimensionSpec equivalent in SQL

Hi All,

I have a multi value dimension field in druid.
And I use SQL queries to retrieve the data.

I want to filter the multi-value field after grouping on it.
As per the documentation : https://druid.apache.org/docs/latest/querying/multi-value-dimensions.html

“dimensions”: [ { “type”: “listFiltered”, “delegate”: { “type”: “default”, “dimension”: “tags”, “outputName”: “tags” }, “values”: [“t3”] } ]

listFiltered type can be used to filter the multi-value dimensions in native queries.

Can this be achieved in the sql queries?

I’m asking the same question:
the dimensionSpec are useful, how can we use it in the SQL world?
Thanks!

Multi-value strings

Druid’s native type system allows strings to potentially have multiple values. These multi-value string dimensions will be reported in SQL as VARCHAR typed, and can be syntactically used like any other VARCHAR. Regular string functions that refer to multi-value string dimensions will be applied to all values for each row individually. Multi-value string dimensions can also be treated as arrays via special multi-value string functions, which can perform powerful array-aware operations.

Grouping by a multi-value expression will observe the native Druid multi-value aggregation behavior, which is similar to the UNNEST functionality available in some other SQL dialects. Refer to the documentation on multi-value string dimensions for additional details.

https://druid.apache.org/docs/latest/querying/sql.html#multi-value-strings

https://druid.apache.org/docs/latest/querying/sql.html#multi-value-string-functions

Based on this, I think you can use a WHERE or HAVING clause using the multi-value-string-functions, but I haven’t tried it myself.

Hi Rachel,

Thank you for your reply!
In my use case, I’m more focused on the usage of Lookup DimensionSpecs and Extraction DimensionSpec on regular string dimensions.
In my native query, I can put such a json into the Dimension, to treat US as US, and treat any other countries to be other before the aggregation.
So even though I have 200 countries in my dataset, when I group by country using this lookup dimensionSpec, I get only 2 groups in my output: the US and the other.
{
“type”: “lookup”,
“dimension”: “country”,
“outputName”: “country”,
“replaceMissingValueWith”: “other”,
“retainMissingValue”: false,
“lookup”: {
“type”: “map”,
“map”: {
“US”: “US”
},
“isOneToOne”: false
}
}

and Extraction DimensionSpec is even more flexible than Lookup DimensionSpecs. Are these functions also support in SQL? Or for now, it’s a native-only function?
Thanks!

How about using a MV_CONTAINS(arr,expr) in a CASE statement. You can also use lookups in SQL.

https://druid.apache.org/docs/latest/querying/sql.html#string-functions

I don’t have a multivalue data set to try it on, so this is just brainstorming.