Very slow Druid ingestion time

Hello all,

My setup is as follows:

Druid cluster, version 0.15.1, deployed on a private cloud.

Hadoop: AWS EMR release 5.15.0.
The cluster consists of 16 instances, each with 4 vCores, 16 GiB memory, and EBS-only storage (EBS storage: 32 GiB).
I am reading one hour of pre-aggregated data (6.6 GB total), so rollup is set to false in the ingestion spec.
The input is about 1000 files in ORC format.

The tuning config and job properties are set as follows:

"tuningConfig": {
  "type": "hadoop",
  "partitionsSpec": {
    "type": "hashed",
    "targetPartitionSize": 10000000
  },
  "maxRowsInMemory": 500000,
  "buildV9Directly": true,
  "jobProperties": {
    "mapreduce.job.user.classpath.first": "true",
    "mapreduce.job.classloader.system.classes": "java., javax.accessibility., javax.activation., javax.activity., javax.annotation., javax.annotation.processing., javax.crypto., javax.imageio., javax.jws., javax.lang.model., -javax.management.j2ee., javax.management., javax.naming., javax.net., javax.print., javax.rmi., javax.script., -javax.security.auth.message., javax.security.auth., javax.security.cert., javax.security.sasl., javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., org.xml.sax., org.apache.commons.logging., org.apache.log4j., -org.apache.hadoop.hbase., -org.apache.hadoop.hive., org.apache.hadoop.",
    "mapreduce.map.memory.mb": "4096",
    "mapreduce.map.java.opts": "-server -Xms2048m -Xmx2048m -Duser.timezone=UTC -Dfile.encoding=UTF-8",
    "mapreduce.reduce.memory.mb": "12288",
    "mapreduce.reduce.java.opts": "-server -Xms11264m -Xmx11264m -Duser.timezone=UTC -Dfile.encoding=UTF-8",
    "mapreduce.task.timeout": 1800000,
    "mapreduce.map.output.compress": "true",
    "mapreduce.map.output.compress.codec": "org.apache.hadoop.io.compress.SnappyCodec"
  }
},

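One thing worth noting about the partitionsSpec above: with hashed partitioning, the index-generation job creates roughly one reducer per shard, and the shard count per interval is roughly the row count divided by targetPartitionSize, so a 10,000,000-row target can funnel an hour of data into only a handful of large reducers. Below is a minimal sketch of a tighter spec, assuming the common guideline of around 5 million rows per segment; the exact value is an assumption and would need to be tuned against this dataset.

"partitionsSpec": {
  "type": "hashed",
  "targetPartitionSize": 5000000
}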

I would expect my MapReduce jobs on Hadoop to finish quickly; however, the index-generator job takes about 1.5 hours to finish.

This seems excessive. Is there something wrong with my setup that I am not seeing?

Thanks.

The data is located in S3, in the same region as the Hadoop EMR cluster.

Hi Stelios, can you post your full ingestion task log and the YARN application log here?

Thanks
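In case it helps with pulling the YARN side, here is a minimal sketch of the commands I would try on the EMR master node; it assumes YARN log aggregation is enabled and that the job was submitted as the druid user, neither of which I can confirm from this thread.

# List recently finished applications to find the ingestion job's application ID
yarn application -list -appStates FINISHED

# Fetch the aggregated logs for that application (adjust -appOwner to the submitting user)
yarn logs -applicationId <application_id> -appOwner druid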

Hello Ming,

I can only find the ingestion task log that Druid shows me in the console. I SSH'ed into the Hadoop master and looked in

/var/log/hadoop-yarn/

but I can't find any logs specific to the successful job.

No luck with
yarn logs -applicationId


command either.

Is there something else I could try to provide you with more info?

ingestion_log.txt (11.2 KB)

The only content I am seeing when I run the command:

yarn logs -appOwner druid -applicationId application_1566997258299_0085


is

Container: container_1566997258299_0085_01_000233 on ip-172-31-45-160.ec2.internal_8041

Thanks, Stelios. Unfortunately this YARN log does not contain much info on why the ingestion was slow.

The ingestion task log shown in the console should be saved to a location configured in your common.runtime.properties. If it is not being saved, I think the higher-priority problem is to get task-log persistence fixed first, so that we have a proper log of the ingestion behaviour and performance.
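For reference, here is a minimal sketch of the kind of common.runtime.properties entries that push task logs to S3; the bucket and prefix are placeholders, not values from your cluster, and the druid-s3-extensions extension must be loaded for this to work.

# Persist indexing task logs to S3 (requires druid-s3-extensions in druid.extensions.loadList)
druid.indexer.logs.type=s3
druid.indexer.logs.s3Bucket=your-druid-logs-bucket
druid.indexer.logs.s3Prefix=druid/indexing-logs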