Query dimension and namespace lookup using topN?


I have a query where I only need to select one dimension. I also have a namespace lookup configured in Druid for this dimension. I need the Druid query to return both the dimension value (an id in my case) and the namespace lookup value (a human-readable name) for each record. I would like to use a topN query because it can be executed efficiently. However, it seems I am forced to use a groupBy query to return both the dimension value and the corresponding lookup value. I realize that in some cases two dimension values can map to the same lookup value, and Druid will then group by the lookup value. However, Druid supports the "injective" property, which declares that there is a 1:1 mapping between dimension values and namespace lookup values. It should be possible to tell Druid to run a normal topN query and then perform the lookup mapping just before results are returned to the caller, perhaps in the broker.

Is there a way to get the performance benefits of a topN query while still having the query return both the dimension value and the corresponding lookup value? Thanks!

  • Brian

This is something that has been mentioned a few times. The only solution I know of right now is to encode both values in the lookup result. For example, instead of having a map of

'some_id' -> 'some_name'

have a map of

'some_id' -> '{"id":"some_id", "name":"some_name"}'


and handle the JSON object at the final destination.
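As a rough sketch of what that could look like on the client side (this is only an illustration, not anything Druid provides): the helper names `encode_lookup` and `decode_rows`, the dimension name `dim`, and the sample result row are all made up for the example. The first function builds the encoded map you would load as the lookup, and the second splits the encoded value back into id and name once the topN results arrive.

```python
import json


def encode_lookup(id_to_name):
    """Build the encoded map: id -> JSON string carrying both id and name."""
    return {
        key: json.dumps({"id": key, "name": name}, sort_keys=True)
        for key, name in id_to_name.items()
    }


def decode_rows(rows, dimension):
    """Replace the encoded dimension value in each row with separate id/name fields."""
    decoded = []
    for row in rows:
        payload = json.loads(row[dimension])
        # Keep every other field (metrics etc.) untouched.
        out = {k: v for k, v in row.items() if k != dimension}
        out["id"] = payload["id"]
        out["name"] = payload["name"]
        decoded.append(out)
    return decoded


lookup = encode_lookup({"some_id": "some_name"})

# Hypothetical result rows, shaped roughly like the entries a topN query returns
# when the dimension has been mapped through the encoded lookup.
rows = [{"dim": lookup["some_id"], "count": 42}]
decoded = decode_rows(rows, "dim")
```

Because each encoded string contains the id itself, the mapping stays 1:1 even if two ids happen to share a name.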

This is a workaround until a proper feature can be implemented, but I don't know of such a feature on anyone's roadmap at the moment.

I couldn't find an existing feature request for this, so I opened one:

Please add any color as you see fit.

Thanks for opening the GitHub issue and for suggesting the workaround. I will give it a try.

  • Brian