Global Cached Lookup - JSON/CSV on HDFS

Hi,
We have mapping data (id to description) in both CSV and JSON format.

We are using the following spec snippet:

"lookupExtractorFactory": {
  "type": "cachedNamespace",
  "extractionNamespace": {
    "type": "uri",
    "uri": "file:/home/user/zip_to_city_mapping.json",
    "namespaceParseSpec": {
      "format": "simpleJson"
    }
  }
}

It is working fine on a single-machine cluster. In a multi-node cluster, do we need to copy the file to all Historical nodes?

Or, instead of specifying a local file path, can I refer to an HDFS path for the mapping data file?

Please help. Thanks in advance.

You need to make the file accessible to all nodes. Either copy it to every machine or use NFS, etc.
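For example, a minimal sketch of the same lookup spec with the mapping file on a shared mount, assuming a hypothetical NFS path /mnt/shared/lookups that every node in the cluster can read:

"lookupExtractorFactory": {
  "type": "cachedNamespace",
  "extractionNamespace": {
    "type": "uri",
    "uri": "file:/mnt/shared/lookups/zip_to_city_mapping.json",
    "namespaceParseSpec": {
      "format": "simpleJson"
    }
  }
}

The spec itself is unchanged; only the file: URI now points at a location that is identical on all Historical and Broker machines.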

Eric Graham

Solutions Engineer - Imply

cell: 303-589-4581

email: eric.graham@imply.io

www.imply.io

Hi,
May I know which version of Druid you are using?

Thanks, Eric. That helps.

Hi Prabakaran,
I am using 0.15.0.