Lookup usage during ingestion

Hi,

I am using Druid 0.16.

I have a lookup defined, and it works fine at query time (SQL). I am now trying to use the lookup in an ingestion spec.

According to the documentation (http://druid.incubator.apache.org/docs/latest/misc/math-expr.html), using a lookup in an ingestion spec is possible. When I try the following:

"transforms": [
  {
    "type": "expression",
    "name": "location_desc",
    "expression": "lookup(\"DOLocationID\", 'location_to_zone')"
  }
]

This results in an error stating “Lookup [location_to_zone] not found”.

My question is: is it possible to use a lookup during ingestion? If yes, is any specific configuration needed in the lookup definition?

Please help.

Regards, Chari.

Hi Lakshminarayana,
I think it is possible to use lookups during ingestion as well.

Can you post your complete ingestion spec?

Is it batch or streaming ingestion?

Thank you.

–siva

Hi Siva,

Attached files:

  • sample data to ingest

  • ingestion spec

  • lookup definition json

  • sample lookup mapping data.

Regards, Chari.

ingest_lookup_spec.json (1.4 KB)

location_zone_lookup.csv (5.29 KB)

lookup_definition.json (388 Bytes)

yellow_tripdata_sample.csv (1.11 KB)

Hey Chari,

Does the lookup work at query time?

As far as I understand it, any lookups that work at query time should work at ingest time too. If that’s not the case, it might be something we should look into. I’d start by double-checking the following:

  1. The lookup is properly loadable and works at query time.

  2. You still have druid.lookup.enableLookupSyncOnStartup = true (as is the default). This is important because otherwise, the lookup won’t be loaded when the ingestion engine initializes.
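Both checks can be scripted. The following is a minimal sketch, not a verified procedure: it assumes the Coordinator listens on localhost:8081 and the Broker on localhost:8082, that the lookup lives in the `__default` tier, and that your version exposes the Coordinator lookup status endpoint. The helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed addresses; adjust to your cluster.
COORDINATOR = "http://localhost:8081"
BROKER = "http://localhost:8082"


def lookup_status_url(lookup, tier="__default", base=COORDINATOR):
    """URL of the Coordinator's load status for one lookup in one tier."""
    return (
        f"{base}/druid/coordinator/v1/lookups/status/"
        f"{quote(tier, safe='')}/{quote(lookup, safe='')}"
    )


def sql_lookup_request(key, lookup, base=BROKER):
    """Build (but do not send) a SQL request that resolves one key via the lookup."""
    sql = f"SELECT LOOKUP('{key}', '{lookup}') AS v"
    body = json.dumps({"query": sql}).encode("utf-8")
    return Request(
        f"{base}/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(req_or_url):
    """Send the request and decode the JSON response (needs a running cluster)."""
    with urlopen(req_or_url, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Print what would be sent; call fetch_json(...) on either value to run it.
    print(lookup_status_url("location_to_zone"))
    print(sql_lookup_request("1", "location_to_zone").data.decode("utf-8"))
```

Setting 2 itself is a runtime property (druid.lookup.enableLookupSyncOnStartup), so it has to be checked in the service configuration rather than through these requests.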

Gian

Hi Gian,

I jumped the gun. Thanks for your reply.

After changing druid.lookup.enableLookupSyncOnStartup to true (default was false), the ingestion works fine.

Works: lookup("DOLocationID", 'location_to_zone')

Does not work: lookup(cast("DOLocationID" as string), 'location_to_zone')

Is it not possible to nest transformations inside a lookup? The use case: say “customer_id” is a numeric column. We want to ingest it as numeric (as we don’t want to create indexes on these columns), but we also want to use a lookup on this column to populate “first_name” for each “customer_id”. In such a case we have to cast first and then look up.

Regards, Chari.
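
One thing that may be worth trying for the nested case: Druid's expression language documents cast as a two-argument function, cast(expr, 'STRING'), rather than the SQL-style "cast(... as string)". The sketch below builds the transform that way and lets json.dumps handle the quote escaping. It is untested against 0.16 and does not confirm that this is the cause of the failure; the helper name lookup_transform is hypothetical.

```python
import json


def lookup_transform(name, column, lookup, cast_to_string=False):
    """Build one expression transform that maps `column` through `lookup`."""
    col = f'"{column}"'  # double quotes mark a column reference in expressions
    if cast_to_string:
        # Assumption: function-style cast from the expression language docs.
        col = f"cast({col}, 'STRING')"
    return {
        "type": "expression",
        "name": name,
        "expression": f"lookup({col}, '{lookup}')",
    }


if __name__ == "__main__":
    # In a full ingestion spec this object goes under dataSchema.transformSpec.
    spec = {
        "transforms": [
            lookup_transform(
                "location_desc",
                "DOLocationID",
                "location_to_zone",
                cast_to_string=True,
            )
        ]
    }
    print(json.dumps(spec, indent=2))
```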