Druid 0.11.0 : Error when indexing - java.io.IOException: No such file or directory

Hi

In a single-node configuration, when indexing a new datasource from a CSV file, I get the error below. It seems to happen in the reduce phase, after all the map tasks have completed successfully. The exception does not name the file or directory it can't find, so I'm stuck.

All the required directories exist and I have run the init script.

Any ideas?

Thank You

  "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "/var/tmp/sme_30062017_druid.csv"
      },
      "metadataUpdateSpec" : null,
      "segmentOutputPath" : "file:/var/druid/segments/"
    },
    "tuningConfig" : {
      "type" : "hadoop",
      "workingPath" : "/var/druid/hadoop-tmp",
      "version" : "2018-02-02T09:28:13.787Z",
      "partitionsSpec" : {
        "type" : "hashed",
        "targetPartitionSize" : 1000000,
        "maxPartitionSize" : 1500000,
        "assumeGrouped" : false,
        "numShards" : -1,
        "partitionDimensions" : [ ]
      },
      "shardSpecs" : {
        "946684800000" : [ {
          "actualSpec" : {
            "type" : "none"
          },
          "shardNum" : 0
        } ]
      },
      "indexSpec" : {
        "bitmap" : {
          "type" : "concise"
        },
        "dimensionCompression" : "lz4",
        "metricCompression" : "lz4",
        "longEncoding" : "longs"
      },
      "maxRowsInMemory" : 75000,
      "leaveIntermediate" : false,
      "cleanupOnFailure" : true,
      "overwriteFiles" : false,
      "ignoreInvalidRows" : false,
      "jobProperties" : { },
      "combineText" : false,
      "useCombiner" : false,
      "buildV9Directly" : true,
      "numBackgroundPersistThreads" : 0,
      "forceExtendableShardSpecs" : false,
      "useExplicitVersion" : false,
      "allowedHadoopPrefix" : [ ]
    },
    "uniqueId" : "bcfe6b425a2345ae81029b021ee2b1e8"
  }
}
2018-02-02T09:29:02,001 INFO [Thread-69] org.apache.hadoop.mapred.LocalJobRunner - reduce task executor complete.
2018-02-02T09:29:02,114 WARN [Thread-69] org.apache.hadoop.mapred.LocalJobRunner - job_local1904673355_0002
java.lang.Exception: java.io.IOException: No such file or directory
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.7.3.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) [hadoop-mapreduce-client-common-2.7.3.jar:?]
Caused by: java.io.IOException: No such file or directory
	at java.io.UnixFileSystem.createFileExclusively(Native Method) ~[?:1.8.0_141]
	at java.io.File.createTempFile(File.java:2024) ~[?:1.8.0_141]
	at java.io.File.createTempFile(File.java:2070) ~[?:1.8.0_141]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:568) ~[druid-indexing-hadoop-0.11.0.jar:0.11.0]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:489) ~[druid-indexing-hadoop-0.11.0.jar:0.11.0]
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) ~[hadoop-mapreduce-client-common-2.7.3.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_141]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_141]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_141]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_141]
	at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_141]
2018-02-02T09:29:03,027 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_local1904673355_0002 failed with state FAILED due to: NA
2018-02-02T09:29:03,038 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 30
	File System Counters
		FILE: Number of bytes read=8651975749
		FILE: Number of bytes written=9162131091
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
	Map-Reduce Framework
		Map input records=607779
		Map output records=607779
		Map output bytes=894757280
		Map output materialized bytes=897188450
		Input split bytes=2466
		Combine input records=0
		Combine output records=0
		Reduce input groups=0
		Reduce shuffle bytes=897188450
		Reduce input records=0
		Reduce output records=0
		Spilled Records=1168290
		Shuffled Maps =9
		Failed Shuffles=0
		Merged Map outputs=9
		GC time elapsed (ms)=505
		Total committed heap usage (bytes)=15571353600
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=0
	File Output Format Counters
		Bytes Written=0
2018-02-02T09:29:03,043 INFO [task-runner-0-priority-0] io.druid.indexer.JobHelper - Deleting path[/var/druid/hadoop-tmp/sme_30062017/2018-02-02T092813.787Z_bcfe6b425a2345ae81029b021ee2b1e8]
2018-02-02T09:29:03,056 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_sme_30062017_2018-02-02T09:28:13.782Z, type=index_hadoop, dataSource=sme_30062017}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:218) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:226) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.11.0.jar:0.11.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_141]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_141]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_141]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_141]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:215) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	... 7 more
Caused by: io.druid.java.util.common.ISE: Job[class io.druid.indexer.IndexGeneratorJob] failed!
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:390) ~[druid-indexing-hadoop-0.11.0.jar:0.11.0]
	at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:95) ~[druid-indexing-hadoop-0.11.0.jar:0.11.0]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:279) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_141]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_141]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_141]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:215) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	... 7 more
2018-02-02T09:29:03,063 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_sme_30062017_2018-02-02T09:28:13.782Z] status changed to [FAILED].
2018-02-02T09:29:03,066 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_sme_30062017_2018-02-02T09:28:13.782Z",
  "status" : "FAILED",
  "duration" : 45956
}
2018-02-02T09:29:03,071 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.listener.announcer.ListenerResourceAnnouncer.stop()] on object[io.druid.query.lookup.LookupResourceListenerAnnouncer@681061d6].

..............................


I found the problem by going through jvm.config and the properties files.

Can I ask why all the paths in the config files are relative?

Cheers
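For anyone else hitting this: the stack trace shows the failure inside `File.createTempFile`, which raises a bare "No such file or directory" when the target temp directory does not exist — with no path in the message. A minimal sketch reproducing that behavior (the class name and the missing path are made up for illustration):

```java
import java.io.File;
import java.io.IOException;

public class TmpDirCheck {
    // Returns true if a temp file can be created under dir; returns false
    // when the JVM raises the same pathless "No such file or directory"
    // IOException seen in the reducer stack trace above.
    static boolean canCreateTemp(File dir) {
        try {
            File f = File.createTempFile("druid-check", ".tmp", dir);
            f.delete(); // clean up on success
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        File missing = new File("/definitely/not/a/real/path"); // illustrative
        File real = new File(System.getProperty("java.io.tmpdir"));
        System.out.println("missing dir usable: " + canCreateTemp(missing)); // false
        // true, assuming the default java.io.tmpdir is writable
        System.out.println("real tmpdir usable: " + canCreateTemp(real));
    }
}
```

So if `java.io.tmpdir` (or another path in the configs) is relative and resolved against an unexpected working directory, the directory won't exist and the reducer fails exactly like this.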

They’re relative so the distribution is ‘self-contained’ when you run it on a single node out of the tarball. You can make them absolute if you want.
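Concretely, the jvm.config files shipped with the tarball typically set the temp directory with a relative path along these lines (excerpt is illustrative, not copied from any specific release):

```
# conf/druid/middleManager/jvm.config (illustrative excerpt)
-server
-Djava.io.tmpdir=var/tmp
```

A relative `var/tmp` is resolved against the directory you launch Druid from, so starting the services from anywhere other than the distribution root means the directory doesn't exist and `createTempFile` fails as above. Changing it to an absolute path, e.g. `-Djava.io.tmpdir=/var/druid/tmp`, removes that dependency on the working directory.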