Batch ingestion error in version 0.9.2

Hi,
In version 0.9.2, I ran into a problem with batch ingestion (Hadoop). My Hadoop version is 2.2.0.

The first task failed with an error, possibly because of some dependency conflict.

Then I tried the settings described at http://druid.io/docs/0.9.2/operations/other-hadoop.html.

When I set "mapreduce.job.classloader": "true", the error is:

2016-12-11 17:39:20,613 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.hdfs.web.HftpFileSystem not found
at java.util.ServiceLoader.fail(ServiceLoader.java:231)
at java.util.ServiceLoader.access$300(ServiceLoader.java:181)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:365)
at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2404)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2415)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2432)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2471)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2453)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.getFileSystem(MRAppMaster.java:534)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:295)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1526)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1743)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1523)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1456)
2016-12-11 17:39:20,618 INFO [Thread-1] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster received a signal. Signaling RMCommunicator and JobHistoryEventHandler.

I also tried mapreduce.job.user.classpath.first = true with mapreduce.job.classloader = false, and the result is worse: the container can't run.

Could somebody help me? Thanks.

My Hadoop version is 2.2.0, based on CDH3.
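
For readers following along, here is a minimal sketch of where the two settings tried above would sit in a Hadoop index task. It assumes the properties go under "jobProperties" in the task's tuningConfig, as the linked other-hadoop page describes; the helper names `hadoop_job_properties` and `tuning_config` are hypothetical and only exist to build the fragment.

```python
import json


def hadoop_job_properties(use_job_classloader=True):
    """Return one of the two classloader-related settings tried in this thread.

    Hypothetical helper: it only builds a dict, it does not talk to Druid.
    """
    if use_job_classloader:
        # Option 1: isolate the job's classes in their own classloader.
        return {"mapreduce.job.classloader": "true"}
    # Option 2: put the job's jars ahead of Hadoop's on the classpath.
    return {"mapreduce.job.user.classpath.first": "true"}


def tuning_config(use_job_classloader=True):
    """Wrap the properties in a tuningConfig fragment (assumed placement)."""
    return {
        "type": "hadoop",
        "jobProperties": hadoop_job_properties(use_job_classloader),
    }


if __name__ == "__main__":
    # Print the fragment so it can be pasted into a task spec.
    print(json.dumps(tuning_config(True), indent=2))
```

The two options are alternatives, so the sketch emits only one of them at a time.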

On Sunday, December 11, 2016 at 7:11:47 PM UTC+8, Zhenyuan Gao wrote:

Hi,
I have a similar problem with the "mapreduce.job.classloader": "true" config, but the error log is:
2016-12-12 11:14:49,752 INFO [main] org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class io.druid.indexer.IndexGeneratorJob$IndexGeneratorOutputFormat not found
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class io.druid.indexer.IndexGeneratorJob$IndexGeneratorOutputFormat not found
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:467)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:368)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1477)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1474)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class io.druid.indexer.IndexGeneratorJob$IndexGeneratorOutputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2047)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:232)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:463)
… 8 more
Caused by: java.lang.ClassNotFoundException: Class io.druid.indexer.IndexGeneratorJob$IndexGeneratorOutputFormat not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1953)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2045)
… 10 more

On Sunday, December 11, 2016 at 7:11:47 PM UTC+8, Zhenyuan Gao wrote:

My Hadoop version is Hadoop 2.5.0-cdh5.2.0.

On Monday, December 12, 2016 at 11:20:20 AM UTC+8, Mark wrote:

Hi, this exception is not a Hadoop/Druid dependency clash; it is more likely to be an issue with the working directory.

Are you using snapshot jars?

Hi,
I use both imply-2.0.0.tar.gz from https://imply.io/ and my own build of the druid-0.9.2 branch with JDK 7; both have the same problem.

On Monday, December 12, 2016 at 11:02:04 PM UTC+8, Slim Bouguerra wrote:

Can you make sure to clean the temporary working directory that Druid uses to store jars on HDFS? It seems like there are 2 different versions of the Druid code base.
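
A rough sketch of that cleanup step, in case it helps. It assumes the `hdfs` CLI is on the PATH and that the working directory is `/tmp/druid-indexing`; that path is only a placeholder, so substitute whatever your cluster is actually configured with. The function names are hypothetical, and the script defaults to a dry run that only prints the command.

```python
import subprocess

# Assumption: placeholder working directory; replace with your configured path.
WORKING_PATH = "/tmp/druid-indexing"


def build_cleanup_command(working_path):
    """Build the `hdfs dfs -rm -r` command for the given HDFS directory."""
    # Refuse empty paths and the filesystem root to avoid deleting too much.
    if not working_path or working_path.strip("/") == "":
        raise ValueError("refusing to remove an empty path or the HDFS root")
    return ["hdfs", "dfs", "-rm", "-r", working_path]


def clean_working_dir(working_path, dry_run=True):
    """Print the cleanup command, and run it only when dry_run is False."""
    cmd = build_cleanup_command(working_path)
    if dry_run:
        print(" ".join(cmd))
        return cmd
    # check=True raises CalledProcessError if the hdfs command fails.
    subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    clean_working_dir(WORKING_PATH, dry_run=True)
```

Running it with `dry_run=True` first makes it easy to confirm the path before anything is deleted.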

Yes. BTW, is the temporary working dir druid.indexer.task.hadoopWorkingPath?
On Tuesday, December 13, 2016 at 10:36:25 AM UTC+8, Slim Bouguerra wrote:

Hi,
I have cleaned up the temp class dir and collected all the details in the attachment; please check it out.

Thanks.
On Tuesday, December 13, 2016 at 11:05:42 AM UTC+8, Mark wrote:

druid_logs.zip (40.8 KB)

@Mark Have you solved this problem? I have a similar problem on Apache Hadoop 2.4.1…
But there is no problem on 2.7.2.

I do not know why.

I found the reason for this problem: it's a Hadoop bug. For more details, please visit https://issues.apache.org/jira/browse/MAPREDUCE-5957

On Thursday, May 25, 2017 at 6:57:11 PM UTC+8, Gary Huang wrote: