Hi all,
Our native tasks fail occasionally with the following exception:
2019-06-05T17:52:44,338 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.common.task.IndexTask - Encountered exception in BUILD_SEGMENTS.
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.io.IOException: java.lang.NullPointerException
at org.apache.druid.data.input.impl.prefetch.Fetcher.checkFetchException(Fetcher.java:200) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.data.input.impl.prefetch.Fetcher.openObjectFromLocal(Fetcher.java:221) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.data.input.impl.prefetch.Fetcher.next(Fetcher.java:174) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.data.input.impl.prefetch.PrefetchableTextFilesFirehoseFactory$2.next(PrefetchableTextFilesFirehoseFactory.java:223) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.data.input.impl.prefetch.PrefetchableTextFilesFirehoseFactory$2.next(PrefetchableTextFilesFirehoseFactory.java:209) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.data.input.impl.FileIteratingFirehose.getNextLineIterator(FileIteratingFirehose.java:90) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.data.input.impl.FileIteratingFirehose.hasMore(FileIteratingFirehose.java:67) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:997) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.indexing.common.task.IndexTask.run(IndexTask.java:466) [druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:421) [druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:393) [druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_212]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: java.lang.NullPointerException
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_212]
at java.util.concurrent.FutureTask.get(FutureTask.java:206) ~[?:1.8.0_212]
at org.apache.druid.data.input.impl.prefetch.Fetcher.checkFetchException(Fetcher.java:188) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
… 14 more
Caused by: java.io.IOException: java.lang.NullPointerException
at org.apache.druid.data.input.impl.prefetch.FileFetcher.download(FileFetcher.java:109) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.data.input.impl.prefetch.Fetcher.fetch(Fetcher.java:135) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.data.input.impl.prefetch.Fetcher.lambda$fetchIfNeeded$0(Fetcher.java:111) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
… 4 more
Caused by: java.lang.NullPointerException
at org.apache.druid.data.input.impl.prefetch.FileFetcher.lambda$download$0(FileFetcher.java:97) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:86) ~[java-util-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:125) ~[java-util-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.data.input.impl.prefetch.FileFetcher.download(FileFetcher.java:95) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.data.input.impl.prefetch.Fetcher.fetch(Fetcher.java:135) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
at org.apache.druid.data.input.impl.prefetch.Fetcher.lambda$fetchIfNeeded$0(Fetcher.java:111) ~[druid-api-0.13.0-incubating.jar:0.13.0-incubating]
… 4 more
Our MiddleManagers run only native tasks. We have 7 instances, each with 8 cores and 32 GB of RAM, and we run 6 workers per host.
We get this exception for 1%-2% of the native (local) tasks we run, which is pretty significant for us.
Druid version: 0.13.0-incubating
Let me know if you need more info regarding this issue.
Any help will be greatly appreciated.
Thanks,
Shachar