[druid-user] Dimension does not show up when put in dimensionExclusions

That is the correct behavior: Druid acts as if a column is filled with nulls when it does not exist. That is how we can support schema evolution.

To check the actual dimensions of a data source, you can use a segment metadata query, which returns the list of existing dimensions per segment.
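
As a rough sketch of the segment metadata query mentioned above, the Python below builds the query body and reads the column names out of a response. The datasource name `events`, the interval, and the sample response are made up for illustration; the body would normally be POSTed to the Broker's `/druid/v2` endpoint, which is left out here.

```python
import json


def build_segment_metadata_query(data_source, interval):
    """Build a segmentMetadata query body for one datasource and interval."""
    return {
        "queryType": "segmentMetadata",
        "dataSource": data_source,
        "intervals": [interval],
    }


def columns_per_segment(results):
    """Map each segment id to the sorted column names reported for it."""
    return {segment["id"]: sorted(segment["columns"]) for segment in results}


# Hypothetical datasource and interval.
query = build_segment_metadata_query("events", "2016-01-01/2016-02-01")
print(json.dumps(query, indent=2))

# Hand-written sample shaped like a segmentMetadata response, not real output.
sample_results = [
    {
        "id": "events_2016-01-01_v1",
        "columns": {
            "__time": {"type": "LONG"},
            "country": {"type": "STRING"},
        },
    },
]
print(columns_per_segment(sample_results))
```

A dimension listed in dimensionExclusions should be absent from the column list of every segment ingested with that spec.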

I need to be able to query the dimension but also make sure it is not indexed, as the value of the dimension is really long and unique. How can I achieve that?

Sorry, I am not sure I understand what you mean by querying the dimension while at the same time it is not indexed. The value has to come from somewhere.

If you need to query the dimension, then it needs to be part of the rows indexed by Druid.

What kind of query do you need to issue against this dimension?

If it is an ID kind of thing and you want to keep track of unique IDs, you can use complex aggregators such as HyperLogLog or sketches.

I mean, how do I retrieve the value? It doesn’t seem to be stored in Druid. I put a dimension in the dimensionExclusions list, and when I run a search query the dimension does not show up.

As the docs state, if a dimension is listed in dimensionExclusions, it will be excluded from indexing.

I am not sure what you are trying to achieve.

Well, I need to get the value somehow even if it’s excluded from indexing. I need it to be stored in Druid, just not indexed.

How do you expect to get the value if you chose to exclude it? That will not work.

You need to either index it or find a good way to summarize it in a form that you can query afterwards.

For example, it is common to index unique IDs as a complex sketch type such as HyperLogLog; you can then do an approximate unique count of IDs, grouped or filtered by your set of dimensions.
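
To make the HyperLogLog suggestion concrete, here is a minimal sketch in Python that builds the two JSON fragments involved: a `hyperUnique` entry for the ingestion `metricsSpec`, and a groupBy query that reads it back as an approximate unique count. The names `user_id`, `unique_users`, `approx_unique_users`, `events`, and `country` are assumptions for illustration, not taken from the original poster's schema.

```python
import json

# Ingestion side: a metricsSpec entry that folds the raw ID field into a
# HyperLogLog object instead of storing it as a dimension.
# "user_id" and "unique_users" are hypothetical names.
ingest_metric = {
    "type": "hyperUnique",
    "name": "unique_users",
    "fieldName": "user_id",
}


def build_unique_count_query(data_source, interval, group_by_dimension):
    """Build a groupBy query that counts approximate unique IDs per dimension value."""
    return {
        "queryType": "groupBy",
        "dataSource": data_source,
        "granularity": "all",
        "intervals": [interval],
        "dimensions": [group_by_dimension],
        "aggregations": [
            {
                "type": "hyperUnique",
                "name": "approx_unique_users",
                # Refers to the metric created at ingestion time.
                "fieldName": "unique_users",
            }
        ],
    }


query = build_unique_count_query("events", "2016-01-01/2016-02-01", "country")
print(json.dumps(ingest_metric, indent=2))
print(json.dumps(query, indent=2))
```

With this approach the individual ID values cannot be read back; only the approximate count of distinct IDs is available at query time.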

I hope you get the point.

Sorry, I’m a little confused. What is the point of putting a value in dimensionExclusions then? Why not skip it entirely if you can’t retrieve its value?

You can set the dimensions list to empty and select what you want to exclude.

This allows you to not specify the schema up front and still exclude some dimensions.

Therefore you can still do schemaless ingestion and selectively exclude some dims.
Hope you get the point!
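
As an illustration of the empty dimensions list plus exclusions described above, the Python sketch below shows a `dimensionsSpec` fragment and a toy model of which fields of an incoming row would end up as dimensions. The function is a simplified assumption about the behavior discussed in this thread, not Druid's actual implementation, and the field names are made up.

```python
def discovered_dimensions(row, dimensions_spec, timestamp_column="timestamp"):
    """Toy model: return the fields of a row that would become dimensions."""
    explicit = dimensions_spec.get("dimensions", [])
    if explicit:
        # An explicit list means only those dimensions are indexed.
        return list(explicit)
    # Empty list: every field is a candidate, minus exclusions and the timestamp.
    excluded = set(dimensions_spec.get("dimensionExclusions", []))
    excluded.add(timestamp_column)
    return [name for name in row if name not in excluded]


# Schemaless spec: no dimensions declared, one long unique field excluded.
dimensions_spec = {
    "dimensions": [],
    "dimensionExclusions": ["session_token"],
}

row = {
    "timestamp": "2016-01-01T00:00:00Z",
    "country": "US",
    "browser": "Firefox",
    "session_token": "a-very-long-unique-value",
}

print(discovered_dimensions(row, dimensions_spec))  # ['country', 'browser']
```

The excluded field is dropped at ingestion, which is why it cannot be queried afterwards.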