Hey all, I’m a bit new to Druid and I’ve run into a problem using the Hadoop indexer to ingest a bunch of JSON data (roughly 150GB). The failure came in either the third or fourth MapReduce task that the indexer started (I’m not sure whether it should have run that many), and toward the end I got an error like this:
Error: java.io.IOException: Unable to create temporary file, /mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1453993122873_0003/container_1453993122873_0003_01_000532/tmp/filePeon4308725064636988861 Blair O’Neal Host New Season of Sexiest Shots - Golf Digest Videos - The Scene#/watch/golfdigest/the-sexiest-shots-in-golf-sports-illustrated-s-kelly-rohrbach-blair-o-neal-host-new-season-of-sexiest-shots?mbid.header
Is there a setting in the spec file where I can choose which variable it uses to create temp file names? I believe this failed because the file name has a ’ in it (though perhaps it was just the spaces? That only just occurred to me).
In addition, can anyone tell me how many MR jobs I should expect the indexer to run before completing?
Thanks for all the help.