How to distribute the indexing reduce step to improve performance

Hi everyone,

I’m working on a toy Druid cluster to test its functionality.

Currently I’m trying to efficiently ingest a day of data with the Hadoop MapReduce indexer.

  • Input format: ORC
  • tuningConfig:
"tuningConfig" : {
  "type" : "hadoop",
  "partitionsSpec" : {
    "type" : "hashed",
    "targetPartitionSize" : 1000000
  },
  "jobProperties" : {
    "dfs.client.use.datanode.hostname" : "true",
    "dfs.datanode.use.datanode.hostname" : "true",
    "mapreduce.map.java.opts" : "-Xmx10g -Duser.timezone=UTC -Dfile.encoding=UTF-8",
    "mapreduce.job.user.classpath.first" : "true",
    "mapreduce.reduce.java.opts" : "-Xmx12g -Duser.timezone=UTC -Dfile.encoding=UTF-8",
    "mapreduce.map.memory.mb" : 16384,
    "mapreduce.reduce.memory.mb" : 16384,
    "mapreduce.job.classloader": "true",
    "mapreduce.map.tasks": 80,
    "mapreduce.reduce.tasks": 55,
    "mapreduce.job.classloader.system.classes": "java., javax.accessibility., javax.activation., javax.activity., javax.annotation., javax.annotation.processing., javax.crypto., javax.imageio., javax.jws., javax.lang.model., -javax.management.j2ee., javax.management., javax.naming., javax.net., javax.print., javax.rmi., javax.script., -javax.security.auth.message., javax.security.auth., javax.security.cert., javax.security.sasl., javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., org.xml.sax., org.apache.commons.logging., org.apache.log4j., -org.apache.hadoop.hbase., -org.apache.hadoop.hive., org.apache.hadoop., core-default.xml, hdfs-default.xml, mapred-default.xml, yarn-default.xml",
    "mapreduce.job.queuename": "queue"
  }
}
  • Rollup is enabled, with an HLL sketch and a count as the only metrics (metricsSpec sketched after this list)
  • 9 dimensions with variable cardinality (all below 400)
  • Toy cluster spec:
HOST SPEC DEPLOYED
1.lan 4 CPU, 16GB Overlord + ZK
2.lan 4 CPU, 16GB Broker
3.lan 16 CPU, 32GB, 256GB Historical, MiddleManager
4.lan 16 CPU, 32GB, 256GB Historical, MiddleManager
  • Version 0.22.1
  • 1 day of data (~300 GB) with a rollup factor of 400x
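For reference, the metrics block looks roughly like this (the sketch name and field name are placeholders, and it assumes the druid-datasketches extension is loaded):
"metricsSpec" : [
  { "type" : "count", "name" : "count" },
  { "type" : "HLLSketchBuild", "name" : "user_id_hll", "fieldName" : "user_id" }
]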

Currently an ingestion takes 4 to 5 hours, and the clear bottleneck is the final reduce of the last job (~80% of total time), when the segments are created. From my understanding, the only way to further distribute this step would be to reduce the partition size, correct? Does anybody know how this can be sped up, and what factors could make it so slow?
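For example, I guess the change would be in the partitioning block, lowering the target size (500000 below is just an illustrative value, half of what I currently use):
"partitionsSpec" : {
  "type" : "hashed",
  "targetPartitionSize" : 500000
}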

Hey @tompere !

So this article is from an old colleague of mine who, back in 2019, spent a tonne of time tuning Hadoop-based ingestion – you may find some nuggets in there.

I note that Hadoop-based ingestion is currently out of favour – have you given the HDFS inputSource in native parallel batch ingestion a go?
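As a rough sketch (the HDFS path, port, and sizing values below are placeholders, and the orc inputFormat needs the druid-orc-extensions loaded), the relevant part of an index_parallel spec would look something like:
"ioConfig" : {
  "type" : "index_parallel",
  "inputSource" : {
    "type" : "hdfs",
    "paths" : "hdfs://namenode:8020/path/to/orc/day1/"
  },
  "inputFormat" : { "type" : "orc" }
},
"tuningConfig" : {
  "type" : "index_parallel",
  "forceGuaranteedRollup" : true,
  "partitionsSpec" : {
    "type" : "hashed",
    "targetRowsPerSegment" : 1000000
  },
  "maxNumConcurrentSubTasks" : 8
}
maxNumConcurrentSubTasks is capped by the total worker capacity of your MiddleManagers, so with your two 16-CPU nodes you’d size it to however many task slots you have free.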

Thank you for your answer. I’ll try the native ingestion.
It was just more convenient for me to use Hadoop since I already had a cluster at my disposal.

I’ve already gone through the article. It was helpful, but the core problem remains: the last reduce operation is computationally intensive and, from what I’ve found, can only be distributed further by tuning "targetPartitionSize" or "numShards", at the cost of smaller segments.
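For instance, fixing the shard count directly instead of a target size (100 below is just an illustrative number) would be:
"partitionsSpec" : {
  "type" : "hashed",
  "numShards" : 100
}
As far as I understand, when numShards is set (with explicit intervals in the granularitySpec) the determine-partitions job is skipped and the final reduce runs one reducer per shard and interval, but the trade-off is the same: more, smaller segments.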
