Druid quickstart batch indexing job failed

I just ran:
curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/wikiticker-index.json localhost:8090/druid/indexer/v1/task

under druid-0.9.0, and the task failed with the log below:


2016-05-31T02:54:08,598 INFO [Thread-42] org.apache.hadoop.mapred.LocalJobRunner - reduce task executor complete.
2016-05-31T02:54:08,632 WARN [Thread-42] org.apache.hadoop.mapred.LocalJobRunner - job_local1933875771_0002
java.lang.Exception: java.io.IOException: Unable to delete directory var/tmp/base5359313615494153816flush/merged/v8-tmp.
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) [hadoop-mapreduce-client-common-2.3.0.jar:?]
Caused by: java.io.IOException: Unable to delete directory var/tmp/base5359313615494153816flush/merged/v8-tmp.
	at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1541) ~[commons-io-2.4.jar:2.4]
	at io.druid.segment.IndexMerger.makeIndexFiles(IndexMerger.java:1010) ~[druid-processing-0.9.0.jar:0.9.0]
	at io.druid.segment.IndexMerger.merge(IndexMerger.java:421) ~[druid-processing-0.9.0.jar:0.9.0]
	at io.druid.segment.IndexMerger.persist(IndexMerger.java:193) ~[druid-processing-0.9.0.jar:0.9.0]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.persist(IndexGeneratorJob.java:501) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:672) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:469) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_11]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_11]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_11]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_11]
	at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_11]
2016-05-31T02:54:08,738 DEBUG [Thread-42] org.apache.hadoop.security.UserGroupInformation - PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:329)
2016-05-31T02:54:09,300 DEBUG [task-runner-0-priority-0] org.apache.hadoop.security.UserGroupInformation - PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:665)
2016-05-31T02:54:09,302 DEBUG [task-runner-0-priority-0] org.apache.hadoop.security.UserGroupInformation - PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
2016-05-31T02:54:09,302 DEBUG [task-runner-0-priority-0] org.apache.hadoop.security.UserGroupInformation - PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
2016-05-31T02:54:09,302 DEBUG [task-runner-0-priority-0] org.apache.hadoop.security.UserGroupInformation - PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:665)
2016-05-31T02:54:09,302 DEBUG [task-runner-0-priority-0] org.apache.hadoop.security.UserGroupInformation - PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
2016-05-31T02:54:09,303 DEBUG [task-runner-0-priority-0] org.apache.hadoop.security.UserGroupInformation - PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
2016-05-31T02:54:09,303 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_local1933875771_0002 failed with state FAILED due to: NA
2016-05-31T02:54:09,303 DEBUG [task-runner-0-priority-0] org.apache.hadoop.security.UserGroupInformation - PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getCounters(Job.java:753)
2016-05-31T02:54:09,312 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 33
	File System Counters
		FILE: Number of bytes read=102216878
		FILE: Number of bytes written=51512241
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
	Map-Reduce Framework
		Map input records=39244
		Map output records=39244
		Map output bytes=16736001
		Map output materialized bytes=16892983
		Input split bytes=316
		Combine input records=0
		Combine output records=0
		Reduce input groups=1
		Reduce shuffle bytes=16892983
		Reduce input records=39244
		Reduce output records=0
		Spilled Records=78488
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=1023
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=400408576
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=0
	File Output Format Counters
		Bytes Written=8
2016-05-31T02:54:09,312 DEBUG [task-runner-0-priority-0] org.apache.hadoop.security.UserGroupInformation - PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
2016-05-31T02:54:09,312 DEBUG [task-runner-0-priority-0] org.apache.hadoop.security.UserGroupInformation - PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getCounters(Job.java:753)
2016-05-31T02:54:09,321 INFO [task-runner-0-priority-0] io.druid.indexer.JobHelper - Deleting path[var/druid/hadoop-tmp/wikiticker/2016-05-31T025323.751Z/f594dcc07af641d1b6f625d1f7bf80ef]
2016-05-31T02:54:09,347 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_wikiticker_2016-05-31T02:53:23.688Z, type=index_hadoop, dataSource=wikiticker}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:160) ~[druid-indexing-service-0.9.0.jar:0.9.0]
	at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:208) ~[druid-indexing-service-0.9.0.jar:0.9.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:338) [druid-indexing-service-0.9.0.jar:0.9.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:318) [druid-indexing-service-0.9.0.jar:0.9.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_11]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_11]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_11]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_11]
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_11]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_11]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_11]
	at java.lang.reflect.Method.invoke(Method.java:483) ~[?:1.8.0_11]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
	... 7 more
Caused by: com.metamx.common.ISE: Job[class io.druid.indexer.IndexGeneratorJob] failed!
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:343) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
	at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:94) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:261) ~[druid-indexing-service-0.9.0.jar:0.9.0]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_11]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_11]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_11]
	at java.lang.reflect.Method.invoke(Method.java:483) ~[?:1.8.0_11]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
	... 7 more
2016-05-31T02:54:09,368 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_wikiticker_2016-05-31T02:53:23.688Z",
  "status" : "FAILED",
  "duration" : 31123
}
2016-05-31T02:54:09,381 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.coordination.AbstractDataSegmentAnnouncer.stop()] on object[io.druid.server.coordination.BatchDataSegmentAnnouncer@66a53104].
2016-05-31T02:54:09,381 INFO [main] io.druid.server.coordination.AbstractDataSegmentAnnouncer - Stopping class io.druid.server.coordination.BatchDataSegmentAnnouncer with config[io.druid.server.initialization.ZkPathsConfig@22e2266d]
2016-05-31T02:54:09,381 INFO [main] io.druid.curator.announcement.Announcer - unannouncing [/druid/announcements/localhost:8100]
2016-05-31T02:54:09,828 DEBUG [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn - Got notification sessionid:0x155047ed5de0011
2016-05-31T02:54:09,828 DEBUG [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn - Got WatchedEvent state:SyncConnected type:NodeDeleted path:/druid/announcements/localhost:8100 for sessionid 0x155047ed5de0011
2016-05-31T02:54:09,828 DEBUG [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn - Got notification sessionid:0x155047ed5de0011
2016-05-31T02:54:09,828 DEBUG [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn - Got WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/druid/announcements for sessionid 0x155047ed5de0011
2016-05-31T02:54:09,830 DEBUG [Announcer-0] io.druid.curator.announcement.Announcer - Path[/druid/announcements] got event[PathChildrenCacheEvent{type=CHILD_REMOVED, data=ChildData{path='/druid/announcements/localhost:8100', stat=1113191,1113191,1464663221579,1464663221579,0,0,0,95987909858820113,111,0,1113191, data=null}}]

Hi xie cui,

Can you check the existence of that temp dir, and also check if you’re running into permissions issues?
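A quick way to run that check from the directory Druid was started in (the `var/tmp/...` path in the error is relative to it). The `DRUID_TMP_DIR` variable here is hypothetical; substitute the actual path from your error message. For illustration only, the script falls back to a fresh `mktemp` directory if the variable is unset:

```shell
# Does the temp dir exist, and can the current user create and
# delete files inside it?
DIR="${DRUID_TMP_DIR:-$(mktemp -d)}"

if [ -d "$DIR" ]; then
  echo "exists: $DIR"
else
  echo "missing: $DIR"
fi

# Try to create and remove a scratch file to verify write permission.
if touch "$DIR/.write-test" 2>/dev/null && rm -f "$DIR/.write-test"; then
  echo "writable: $DIR"
else
  echo "not writable: $DIR (check ownership with: ls -ld $DIR)"
fi
```

If the directory is owned by a different user than the one running the Druid services, that alone can cause the `Unable to delete directory` error.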

I also noticed that your temp dir is a relative path, are you overriding java.io.tmpdir?
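For reference, the quickstart configs ship with a relative temp dir, which is why the path in the error has no leading slash. One way to rule that out is to point it at an absolute path instead; this is a sketch, and the exact config file path may differ in your setup:

```
# conf-quickstart/druid/middleManager/jvm.config (location may vary)
# Replace the relative temp dir with an absolute one on local disk:
-Djava.io.tmpdir=/tmp/druid
```

With a relative path, the effective temp dir depends on the working directory each service was launched from, which can silently point different processes at different locations.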

Thanks,

Jon

Hi Jon,
I’m getting this error too; I’m trying to run the quickstart wiki example on a CentOS machine.


Hi Jon, I ran into the same issue. I had once successfully ingested data, but one day this problem appeared, and since then I can no longer ingest any data. It always reports this error:

2016-12-26T05:20:07,386 WARN [Thread-59] org.apache.hadoop.mapred.LocalJobRunner - job_local1415317447_0002
java.lang.Exception: java.io.IOException: Unable to delete file: var/tmp/base1469712321305818889flush/merged/v8-tmp/.nfs0000000007aa216200000e1a
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) [hadoop-mapreduce-client-common-2.3.0.jar:?]
Caused by: java.io.IOException: Unable to delete file: var/tmp/base1469712321305818889flush/merged/v8-tmp/.nfs0000000007aa216200000e1a
	at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279) ~[commons-io-2.4.jar:2.4]
	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) ~[commons-io-2.4.jar:2.4]
	at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) ~[commons-io-2.4.jar:2.4]

By the way, all the configurations are at their defaults.
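One note on your variant of the error: the `.nfs0000000007aa216200000e1a` file in the path is an NFS "silly rename" placeholder, which the NFS client creates when a file is deleted while some process still holds it open. The directory cannot be removed until that file handle is closed, so `deleteDirectory` fails. This suggests your working directory is on an NFS mount. A sketch of a workaround, assuming you can spare local disk, is to move Druid's scratch space off NFS (property names below are from the Druid configuration reference; verify them against your version's docs):

```
# middleManager jvm.config: put the JVM temp dir on local disk
-Djava.io.tmpdir=/tmp/druid

# common runtime.properties: put the Hadoop indexing scratch dir on
# local disk as well (default is the relative var/druid/hadoop-tmp)
druid.indexer.task.hadoopWorkingPath=/tmp/druid/hadoop-tmp
```

Alternatively, `lsof` on the `.nfs…` file should show which process still holds it open.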

On Wednesday, June 1, 2016 at 8:57:08 AM UTC+8, Jonathan Wei wrote: