Help with lookups

I have placed the below lookup in each of broker, hostorical and overlord configs.

druid.query.extraction.namespace.lookups=[{“type”:“jdbc”,“namespace”:“ugdlookup”,“connectorConfig”:{“createTables”:true,“connectURI”:“jdbc:sqlserver://HOST:PORT;DatabaseName=DBNAME”,“user”:“XXXXX”, “password”:“XXXXX” }, “table”:“CM_DEVICE_TYPE_LOOKUP”, “keyColumn”:“device_type”, “valueColumn”:“device_type_name”}]

I have loaded the corresponding namespace-lookup extension in the common config also.

druid.extensions.coordinates=[“io.druid.extensions:druid-examples”,“io.druid.extensions:druid-kafka-eight”,“io.druid.extensions:druid-hdfs-storage”,“io.druid.extensions:mysql-metadata-storage”,“io.druid.extensions:druid-namespace-lookup”]

I have a pivot setup in which config.yaml I have added in dimensions:

  • name: ugd

expression: $ugd.lookup(‘ugdlookup’)

But Pivot is showing Query Error. Bad Status Code.

Also the broker log has 2016-02-25T12:14:35,614 INFO [qtp718286548-70] io.druid.server.QueryResource - Instantiation of [simple type, class io.druid.query.extraction.NamespacedExtractor] value failed: Namespace [ugdlookup] not found [37830403-49bc-49d6-9d34-5fd4abb26c61]

So, why is the broker not able to find ugdlookup namespace-lookup?

Hi Rishi,
as far as I know, I have put the namespaces declaration in the runtime.properties file of broker, historical, realtime, overlord and middleManager nodes.
You can verify if the broker (and any other of those nodes) successfully loaded the namespaces, by looking for a line in their log-file like this one:
2016-01-29T11:28:26,260 INFO [main] io.druid.server.namespace.NamespacedExtractionModule - Loaded 6 namespace-lookup configuration
Best,
Marco

In the logs of the broker I did find the line
2016-02-25T13:13:57,532 INFO [main] io.druid.server.namespace.NamespacedExtractionModule - Loaded 1 namespace-lookup configuration

Still, on querying by pivot it gives Namespace [ugdlookup] not found

I think you should load namespaces also on realtime nodes.
In any case, try to query Druid out of Pivot by composing a query like the one below (it is only an example of lookup usage).
Send that query via curl separately to the realtime, historical and broker nodes (changing the timeperiod according to your datasource hand-off parameters).
{
“queryType”: “groupBy”,
“dataSource”: “pulsar_event”,
“intervals”: [ “2016-01-12T10:40:00/2016-01-12T10:59:00” ],
“limitSpec”: {“type”:“default”,“limit”:5,“columns”:[{“dimension”:“Tot_Eve”,“direction”:“descending”}]},
“dimensions”:[“cap”,
“et”,
{
“type”: “extraction”,
“dimension”: “capQ”,
“outputName”: “Class”,
“extractionFn”: {
“type”:“lookup”,
“lookup”:{“type”:“namespace”,“namespace”:“sso_ec”},
“retainMissingValue”:true,
“injective”:false
}
}],
“metric”:“count”,
“granularity”: “all”,
“aggregations”: [
{“type”: “longSum”, “name”: “Tot_Eve”, “fieldName”:“count”}
]

}

It was required on the realtime node too.

Thanks Marco!! .

FYI, there are a lot of lookup improvements coming over the next few releases, roughly tracked through https://github.com/druid-io/druid/issues/2329

Realized logging is a bit lacking in the JDBC namespace stuff. Filed in https://github.com/druid-io/druid/pull/2548

Can you enable debug logging for io.druid.server.namespace.cache.NamespaceExtractionCacheManager ? You should see log messages when it is loaded up.

You should see Trying to update namespace [XXX] and Namespace [XXX] successfully updated in the logs

Note if you’re using stock druid you should be able to do -Dlog4j.configurationFile=log4j2.debug.xml to get debug logging for all druid classes

In documentation it was mentioned that something like this can be done for lookup

{ “type”:“lookup”, “dimension”:“dimensionName”, “outputName”:“dimensionOutputName”, “replaceMissingValueWith”:“missing_value”, “retainMissingValue”:false, “lookup”:{“type”: “map”, “map”:{“key”:“value”}, “isOneToOne”:false} }

``

Where should this config be placed? I placed it in dimentionSpec in tranquility kafka.json. But it didn’t take effect. I need to translate country Id to country name. So I can keep hardcoded country id to country name mapping using above config. But where should be put?

Hi Shantanu,

I do not know about how to use lookups from Pivot (I’ve never used it), but the mechanism you have included in your message is a Lookup Extraction Function, which goes directly into the query sent to Druid. So in your query, wherever you would specify a dimensionSpec you would include that JSON object. I expect there is a way to define a dimension in Pivot such that it uses the lookup - this is what I have done from Caravel.

Best of luck,

–Ben