How to configure logging (e.g. change the log level) for the Hadoop indexer job container

Druid release: 0.10.0

Hello Everybody!

The index_hadoop task, running in a YARN NodeManager container (on a datanode), logs everything to stdout, and the NodeManager saves this output to a stdout file in a configurable directory (yarn.nodemanager.log-dirs in yarn-site.xml). The problem is that I cannot control what the Druid indexer outputs inside such a job container.

This line in particular: https://github.com/druid-io/druid/blob/master/indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfig.java#L580 writes a LOT of data to the stdout file in each map container’s directory.

How can I set the log level to WARN? Perhaps I could somehow “submit” a log4j2.xml for such a job to the datanode, but how, and where should I put it? Configuring the log level for the YARN NodeManager or for MapReduce (using container-log4j.properties) didn’t help.

In the submitted job I use a separate classloader (maybe this is important):

"mapreduce.job.classloader": "true",
"mapreduce.job.classloader.system.classes": "-javax.validation.,java.,javax.,org.apache.commons.logging.,org.apache.log4j.,org.apache.hadoop."

The workaround was to add a logger override to druid/common/src/main/resources/log4j2.xml, update log4j2.xml in druid-common-0.10.0.jar, and upload this jar to hdfs:/tmp/druid-indexing/classpath/
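
The exact entry added to log4j2.xml is not shown in this post. As a rough sketch only, a log4j2.xml that quiets the class linked above might look like the following. The logger name is an assumption taken from the package path in that URL (io.druid.indexer.HadoopDruidIndexerConfig), and the appender name and pattern are illustrative rather than copied from Druid's shipped file:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <!-- Illustrative console appender; name and pattern are assumptions -->
    <Console name="Console" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <!-- Assumed logger name: only WARN and above from this class is emitted.
         With default additivity, events still go to the Root appender. -->
    <Logger name="io.druid.indexer.HadoopDruidIndexerConfig" level="warn"/>
    <!-- Everything else keeps logging at INFO -->
    <Root level="info">
      <AppenderRef ref="Console"/>
    </Root>
  </Loggers>
</Configuration>
```

Setting the Root level to "warn" instead would silence INFO output from all classes in the container, not just this one.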

But is there another way?

I’d like to bump this thread as I’m having the same question.
This log output is really causing us production issues as it is filling up the disks on our core nodes and is making them crash.
I know that this has been fixed in the latest Druid release, but what would be a quick and easy way to spin up a Druid Hadoop Indexer cluster with a custom log4j setup?
Thanks

Hi Sascha,

We also hit this issue, and when I dug into it I found that all the Hadoop logging settings are for log4j version 1, so they have no effect on the Druid MapReduce job. For instance, creating a container-log4j.properties, which seemed like it should work, actually had no effect. The only workaround we found that works is the one mentioned above: inserting a modified log4j2.xml into the druid-common jar itself.

Best,

Morri