Hadoop batch ingestion failure

Hi,

I have been running into intermittent ingestion failures. Here is the stack trace from the overlord logs:

[task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_sales-rank-daily_2017-07-27T04:33:48.922Z, type=index_hadoop, dataSource=sales-rank-daily}]
java.util.NoSuchElementException
    at java.util.ArrayList$Itr.next(ArrayList.java:854) ~[?:1.8.0_131]
    at com.google.common.collect.Iterators.getOnlyElement(Iterators.java:297) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.Iterables.getOnlyElement(Iterables.java:285) ~[guava-16.0.1.jar:?]
    at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:202) ~[druid-indexing-service-0.9.2.jar:0.9.2]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.2.jar:0.9.2]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.2.jar:0.9.2]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]

2017-07-27T04:34:02,584 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_sales-rank-daily_2017-07-27T04:33:48.922Z] status changed to [FAILED].

2017-07-27T04:34:02,586 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_sales-rank-daily_2017-07-27T04:33:48.922Z",
  "status" : "FAILED",
  "duration" : 6574
}

I'm not sure where to get more information about the issue or how to go about debugging it. I'm attaching the ingestion spec file as well. Any help would be highly appreciated.
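In case it helps with debugging: rather than tailing the overlord logs, you can poll a task's status through the overlord's task-status endpoint. Below is a minimal sketch; the overlord address (localhost:8090) is an assumption for a default deployment, so adjust it for your cluster:

```python
import json
from urllib.request import urlopen

# Assumption: overlord running on the default host/port; change for your cluster.
OVERLORD = "http://localhost:8090"

def task_status(payload):
    """Extract the status string (e.g. "FAILED") from an overlord
    task-status response of the shape {"task": ..., "status": {...}}."""
    return payload.get("status", {}).get("status")

def fetch_task_status(task_id):
    """Ask the overlord for the current status of a task by id."""
    url = "{}/druid/indexer/v1/task/{}/status".format(OVERLORD, task_id)
    with urlopen(url) as resp:
        return task_status(json.load(resp))
```

For the task above you would call `fetch_task_status("index_hadoop_sales-rank-daily_2017-07-27T04:33:48.922Z")`, which should report "FAILED" with the same duration shown in the log snippet.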

ingestion_spec.txt (2.7 KB)

Also, I wanted to add that I am using Druid version 0.9.2.

Our prod cluster stopped ingesting data due to this issue. Any pointers would really help. Thanks a lot!

The issue was due to the metadata store. Once the MySQL DB was restarted, things started working normally again.
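For anyone who lands here with the same symptom: since the root cause was the metadata store being down, a quick reachability check can rule that in or out before digging into task logs. A minimal sketch (the host and port are placeholders for wherever your MySQL metadata store runs):

```python
import socket

def metadata_store_reachable(host, port=3306, timeout=2.0):
    """Return True if a TCP connection to the metadata store succeeds
    within the timeout, False otherwise. This only checks that MySQL is
    accepting connections, not that credentials or tables are valid."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `metadata_store_reachable("metadata-db.example.com")` (hostname here is a placeholder) from the overlord host tells you immediately whether the failure is likely a metadata-store connectivity problem.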