Null values not allowed in lookups?

Hi,

I’m trying to add some lookups based on a customJson file. The key is “userName”, and I’m adding one lookup for each of the remaining value columns.

We don’t have all values for all users, so at first we tried leaving out the missing values, which caused the following exception. We then tried including every known column but setting the unknown values to null instead, but that doesn’t seem to work either.

java.lang.NullPointerException: Value column [country] missing data in line [{"userName":"username","country":null,"email":"user@example.com","gender":"U","cellphone":"+123456789","age":"1983"}]

at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:253) ~[guava-16.0.1.jar:?]

at io.druid.query.lookup.namespace.URIExtractionNamespace$DelegateParser.parse(URIExtractionNamespace.java:220) ~[?:?]

at io.druid.data.input.MapPopulator$1.processLine(MapPopulator.java:67) ~[?:?]

at com.google.common.io.CharStreams.readLines(CharStreams.java:317) ~[guava-16.0.1.jar:?]

at com.google.common.io.CharSource.readLines(CharSource.java:239) ~[guava-16.0.1.jar:?]

at io.druid.data.input.MapPopulator.populate(MapPopulator.java:59) ~[?:?]

at io.druid.server.lookup.namespace.URIExtractionNamespaceCacheFactory$1$1.call(URIExtractionNamespaceCacheFactory.java:177) ~[?:?]

at io.druid.server.lookup.namespace.URIExtractionNamespaceCacheFactory$1$1.call(URIExtractionNamespaceCacheFactory.java:130) ~[?:?]

at com.metamx.common.RetryUtils.retry(RetryUtils.java:60) [java-util-0.27.9.jar:?]

at com.metamx.common.RetryUtils.retry(RetryUtils.java:78) [java-util-0.27.9.jar:?]

at io.druid.server.lookup.namespace.URIExtractionNamespaceCacheFactory$1.call(URIExtractionNamespaceCacheFactory.java:128) [druid-lookups-cached-global-0.9.1.1.jar:0.9.1.1]

at io.druid.server.lookup.namespace.URIExtractionNamespaceCacheFactory$1.call(URIExtractionNamespaceCacheFactory.java:73) [druid-lookups-cached-global-0.9.1.1.jar:0.9.1.1]

at io.druid.server.lookup.namespace.cache.NamespaceExtractionCacheManager$4.run(NamespaceExtractionCacheManager.java:361) [druid-lookups-cached-global-0.9.1.1.jar:0.9.1.1]

at com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator$NeverSuccessfulListenableFutureTask.run(MoreExecutors.java:582) [guava-16.0.1.jar:?]

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_91]

at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_91]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_91]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_91]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]

at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]

regards,

Robin

Hey Robin,

Null values in lookups seem useful for your case (many logical lookups derived from a single JSON file). I raised https://github.com/druid-io/druid/pull/3512 to allow that to work.

With the current Druid version, some other workarounds are:

  1. Split the single physical file into one file per logical lookup, and omit rows with null values.

  2. Use a placeholder value like "Unknown country" or "NULL" instead of an actual null. You can pair this with "replaceMissingValueWith" : "your-placeholder-here" at query time to make sure that true nulls and your placeholders are folded together.
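
As a rough illustration of both workarounds, here is a minimal Python sketch that preprocesses the source file before Druid reads it. It assumes the file is line-delimited JSON with one object per user, as in the exception above. The function names (split_lookup_rows, to_json_lines) and the sample rows are hypothetical and not part of Druid.

```python
import json


def split_lookup_rows(lines, key_field="userName", placeholder=None):
    """Build one key -> value mapping per value column.

    With placeholder=None, null values are omitted (workaround 1).
    With a placeholder string, nulls are replaced by it (workaround 2).
    """
    lookups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        key = row.get(key_field)
        if key is None:
            continue  # a row without a key cannot be looked up
        for column, value in row.items():
            if column == key_field:
                continue
            if value is None:
                if placeholder is None:
                    continue  # workaround 1: drop the null entry
                value = placeholder  # workaround 2: substitute a placeholder
            lookups.setdefault(column, {})[key] = value
    return lookups


def to_json_lines(mapping, key_field, value_field):
    """Serialize one mapping as line-delimited JSON, one object per key."""
    return [json.dumps({key_field: k, value_field: v}) for k, v in mapping.items()]


# Hypothetical sample input, shaped like the line in the exception.
rows = [
    '{"userName":"alice","country":null,"gender":"U"}',
    '{"userName":"bob","country":"NO","gender":null}',
]

omitted = split_lookup_rows(rows)
filled = split_lookup_rows(rows, placeholder="Unknown")

for column, mapping in omitted.items():
    for out_line in to_json_lines(mapping, "userName", column):
        print(column, out_line)
```

Each entry in the returned dictionary could then be written to its own file and registered as a separate lookup. Columns that are absent from a row (rather than null) are skipped in both modes, so they would still rely on "replaceMissingValueWith" at query time.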

Have you tried mapping the values you want reported as missing to the empty string ("")?
