Clarification on lookups

I have loaded the namespace lookup on just the broker node.

1. I am running the following query:

```json
{
  "queryType": "topN",
  "dataSource": "some_source",
  "granularity": "all",
  "dimension": {
    "type": "extraction",
    "dimension": "some_dim",
    "outputName": "dim",
    "extractionFn": {
      "type": "lookup",
      "lookup": { "type": "namespace", "namespace": "test" },
      "retainMissingValue": true,
      "injective": true
    }
  },
  "threshold": 50,
  "metric": "count",
  "aggregations": [
    { "type": "count", "name": "count", "fieldName": "count" }
  ],
  "intervals": ["2016-02-20T00:00:00.000/2016-02-21T00:00:00.000"]
}
```

Without "injective": true, this query gives the following error: "Could not resolve type id 'namespace' into a subtype of [simple type, class io.druid.query.extraction.LookupExtractor]".

But if I set "injective": true in the query, then it runs fine.

So, I wanted to know whether this is the intended behavior, and what exactly setting "injective": true does.
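For context, here is a sketch of the failing variant, assuming the only difference is the missing "injective" flag (hypothetical reconstruction from the query above). My understanding (unconfirmed) is that "injective": true declares the lookup to be one-to-one, which lets the broker apply the extraction after merging results instead of pushing it down to the historicals:

```json
"extractionFn": {
  "type": "lookup",
  "lookup": { "type": "namespace", "namespace": "test" },
  "retainMissingValue": true
}
```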

2. I wanted similar functionality when doing a lookup in a filter: the broker should perform the (reverse) lookup and send the transformed value to the historical nodes, so there should be no need to load lookups on the historicals. As per the documentation at http://druid.io/docs/0.9.0-rc2/querying/dimensionspecs.html:

"A property optimize can be supplied to allow optimization of lookup based extraction filter (by default optimize = true). The optimization layer will run on the broker and it will rewrite the extraction filter as clause of selector filters."

I am running the following query:

```json
{
  "queryType": "timeseries",
  "dataSource": "some_source",
  "intervals": ["2016-02-25T00:00/2016-02-26T01:00"],
  "granularity": "all",
  "filter": {
    "type": "extraction",
    "dimension": "some_dim",
    "value": "United States",
    "extractionFn": {
      "type": "lookup",
      "optimize": true,
      "lookup": { "type": "namespace", "namespace": "test" }
    }
  },
  "aggregations": [
    { "name": "count", "type": "longSum", "fieldName": "count" }
  ]
}
```

So, setting "optimize": true should send the rewritten query to the historical nodes, and there should be no need to load lookups on the historicals.

But this still gives the error: "Could not resolve type id 'namespace' into a subtype of [simple type, class io.druid.query.extraction.LookupExtractor]".

I have checked the logs on the historical nodes: the query is not transformed and still contains the extraction function that was passed to the broker.
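For illustration, if the broker's optimization had fired, I would expect the historicals to receive a plain selector filter instead of the extraction filter, something like this (hypothetical, assuming the lookup maps the key "US" to "United States"):

```json
"filter": {
  "type": "selector",
  "dimension": "some_dim",
  "value": "US"
}
```

Per the documentation quoted above, the rewrite produces a clause of selector filters, so if several keys mapped to "United States" it would presumably be an "or" of such selectors.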

Thanks for contacting us, Saksham.

Could you follow some of the debugging steps outlined in https://groups.google.com/d/msg/druid-user/YUj36m4s3Pc/4ndtQpgWDQAJ ?

Specifically, try using -Dlog4j.configurationFile=log4j2.debug.xml when you launch the brokers or historicals, or as javaOpts for the peons.

You should see more debug information from io.druid.server.namespace.cache.NamespaceExtractionCacheManager, which should tell you which namespaces it is loading.

Also, as a sanity check: the lookup functionality you described lives in an extension. Do you have the druid-namespace-lookup extension on your broker/historical/peon nodes?
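For reference, a minimal sketch of loading that extension, assuming 0.9.x-style configuration in common.runtime.properties (on 0.8.x the equivalent would be druid.extensions.coordinates with Maven coordinates instead):

```properties
# Load the namespaced-lookup extension (0.9.x directory-based extension loading)
druid.extensions.loadList=["druid-namespace-lookup"]
```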

I have the namespace extension on the broker node. I want the broker to handle all the lookups and don't want to load them on the other nodes.

Setting "injective": true achieved that in the first query, and I think setting "optimize": true should have achieved the same in the second query, but that is not the case.

Which version of Druid are you using?

The broker is running 0.9.0-rc2 and the historicals are on 0.8.3.

I am guessing your unapply is returning an empty list: https://github.com/b-slim/druid/blob/d4f00096ff5f53235432380f37b3927547b1eade/processing/src/main/java/io/druid/query/filter/ExtractionDimFilter.java#L105-L105
That's why you get the same filter.
Let me think about this and submit a fix.
I guess that when unapply has no mapped value, you want it to map to null?

I don't think that's the case, since the lookup value is in the table.

I think there should be options like "replaceMissingValueWith" or "retainMissingValue" to handle the cases where it can't unapply.
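For comparison, these knobs already exist on the forward lookup extractionFn; the suggestion is to honor the same semantics when reversing the lookup. A sketch, reusing the "test" namespace from the queries above ("MISSING_LOOKUP" is just a hypothetical placeholder value):

```json
"extractionFn": {
  "type": "lookup",
  "lookup": { "type": "namespace", "namespace": "test" },
  "retainMissingValue": false,
  "replaceMissingValueWith": "MISSING_LOOKUP"
}
```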