Selector filter with lookup extraction function not working

Hi,

I’ve set up a local Druid 0.10.0 installation along with cached lookups (lookups-cached-global extension).

The documentation for both latest and 0.10.0 states that using an extraction filter along with a registered lookup extraction, as I used to do, is now deprecated. So I tried using a Selector Filter along with a Lookup Extraction Function, as described in the documentation:

http://druid.io/docs/latest/querying/dimensionspecs.html#extraction-functions

{
  "type":"lookup",
  "lookup":{"type":"namespace","namespace":"some_lookup"},
  "replaceMissingValueWith":"Unknown",
  "injective":false
}

However, the type “namespace” isn’t found.

Following is the query I issued. I know that the lookup namespace is working because if I use a registeredLookupExtractionFunction, then everything works:

{
  "queryType": "topN",
  "dataSource": {
    "type": "table",
    "name": "test-01"
  },
  "intervals": "2010-01-01/2020-01-01",
  "threshold": 1000,
  "context": {
    "finalize": true,
    "priority": 10002,
    "queryId": "abc",
    "useCache": false,
    "populateCache": false,
    "timeout": 180000
  },
  "dimension": {
    "type": "lookup",
    "name": "activitytype_by_activitytypeid",
    "dimension": "activityTypeId",
    "outputName": "activityType",
    "retainMissingValue": false,
    "replaceMissingValueWith": "Unknown",
    "injective": true,
    "optimize": true
  },
  "filter": {
    "type": "selector",
    "dimension": "countryCode",
    "value": "United States of America",
    "extractionFn": {
      "type": "lookup",
      "lookup": {
        "type": "namespace",
        "namespace": "country_by_countrycode"
      }
    }
  },
  "metric": {
    "type": "numeric",
    "metric": "auctions"
  },
  "granularity": {
    "type": "all"
  },
  "aggregations": [
    {
      "type": "doubleSum",
      "name": "auctions",
      "fieldName": "auctionCount"
    }
  ],
  "descending": false
}

I get the following error message back:

{
   "error": "Unknown exception",
   "errorMessage": "Could not resolve type id 'namespace' into a subtype of [simple type, class io.druid.query.lookup.LookupExtractor]
   at [Source: HttpInputOverHTTP@33a7a9a0[c=1125,q=1,[0]=EOF,s=STREAM]; line: 34, column: 11]",
   "errorClass": "com.fasterxml.jackson.databind.JsonMappingException",
   "host": null
}

   


The reason I would like to try the new syntax is this: with the old lookup syntax, I observe that the broker correctly optimizes a lookup used in a dimension spec, so that the query sent to the historicals no longer contains the lookup and has the dimension IDs instead. However, this does NOT work for a registered lookup extraction used within a filter. Even if I specify injective=true and optimize=true, the query sent to the historical still contains the lookup definition.

What I want to achieve is that, as long as I use injective lookups in filter conditions and dimension splits, the historicals never see them because the broker rewrites them first.
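
To illustrate, this is roughly the filter I would hope the historicals receive after the broker's rewrite. It is only a sketch: the key "US" is a hypothetical entry that I am assuming country_by_countrycode maps to "United States of America"; it is not an actual value from my data.

```json
{
  "type": "selector",
  "dimension": "countryCode",
  "value": "US"
}
```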

Instead, if I issue the following query to the broker:

{
  "queryType": "topN",
  "dataSource": {
    "type": "table",
    "name": "test-01"
  },
  "intervals": "2010-01-01/2020-01-01",
  "threshold": 1000,
  "context": {
    "finalize": true,
    "priority": 10002,
    "queryId": "abc",
    "useCache": false,
    "populateCache": false,
    "timeout": 180000
  },
  "dimension": {
    "type": "lookup",
    "name": "activitytype_by_activitytypeid",
    "dimension": "activityTypeId",
    "outputName": "activityType",
    "retainMissingValue": false,
    "replaceMissingValueWith": "Unknown",
    "injective": true,
    "optimize": true
  },
  "filter": {
    "type": "selector",
    "dimension": "countryCode",
    "value": "United States of America",
    "extractionFn": {
       "type": "registeredLookup",
       "lookup": "country_by_countrycode",
       "injective": true,
       "optimize": true
     }
  },
  "metric": {
    "type": "numeric",
    "metric": "auctions"
  },
  "granularity": {
    "type": "all"
  },
  "aggregations": [
    {
      "type": "doubleSum",
      "name": "auctions",
      "fieldName": "auctionCount"
    }
  ],
  "descending": false
}


...then the query that the broker sends to the historical has only the dimension split optimized and still contains the lookup definition within the filter:

{
  "queryType": "topN",
  "dataSource": {
    "type": "table",
    "name": "test-01"
  },
  "virtualColumns": [
   
  ],
  "dimension": {
    "type": "default",
    "dimension": "activityTypeId",
    "outputName": "activityType",
    "outputType": "STRING"
  },
  "metric": {
    "type": "numeric",
    "metric": "auctions"
  },
  "threshold": 1000,
  "intervals": {
    "type": "segments",
    "segments": [
      {
        "itvl": "2016-02-15T13:00:00.000Z/2016-02-15T14:00:00.000Z",
        "ver": "2016-03-20T16:50:09.754Z",
        "part": 0
      }
    ]
  },
  "filter": {
    "type": "selector",
    "dimension": "countryCode",
    "value": "United States of America",
    "extractionFn": {
      "type": "registeredLookup",
      "lookup": "country_by_countrycode",
      "retainMissingValue": false,
      "replaceMissingValueWith": null,
      "injective": true,
      "optimize": true
    }
  },
  "granularity": {
    "type": "all"
  },
  "aggregations": [
    {
      "type": "doubleSum",
      "name": "auctions",
      "fieldName": "auctionCount",
      "expression": null
    }
  ],
  "postAggregations": [
   
  ],
  "context": {
    "finalize": false,
    "populateCache": false,
    "priority": 10002,
    "queryId": "abc",
    "timeout": 180000,
    "useCache": false
  },
  "descending": false
}


So, my questions are:
- It seems to me that the documentation describes a syntax for namespaced lookup extraction within filters that does not actually work. Can somebody confirm this, or am I making a mistake somewhere?
- How can I get the broker to rewrite injective lookups in both the dimension and the filter sections?
- How can I switch to the new syntax for using cached lookups in a filter expression if the "Registered Extraction Functions" are no longer the recommended way?

thanks