How to improve "Tranquility" ingestor performance.

Hi,
I have two EC2 machines acting as middle managers. Both are r3.4xlarge instances (16 cores, 122 GB memory).

But when I ingest real-time data through Spark Streaming with Tranquility, ingestion takes a very long time.

In the overlord UI, the peon tasks run successfully, yet CPU and memory usage on the middle managers is very low.

Could you give some advice on how to improve Tranquility ingestion performance?

The middle manager configuration is as follows:

druid.host=ip

druid.port=8080

druid.service=druid/prod/middlemanager

# Run the overlord in local mode with a single peon to execute tasks

# This is not recommended for production.

druid.indexer.queue.startDelay=PT0M

# This setting is too small for real production workloads

druid.indexer.runner.javaOpts=-server -Xmx3g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps

# These settings are also too small for real production workloads

# Please see our recommended production settings in the docs (http://druid.io/docs/latest/Production-Cluster-Configuration.html)

druid.indexer.task.baseTaskDir=/data/persistent/task

druid.indexer.logs.type=s3

druid.indexer.logs.s3Bucket=bucket

druid.indexer.logs.s3Prefix=druid/logs

druid.indexer.fork.property.druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]

druid.indexer.fork.property.druid.processing.buffer.sizeBytes=536870912

druid.indexer.fork.property.druid.processing.numThreads=2

druid.indexer.fork.property.druid.segmentCache.locations=[{"path": "/data/persistent/zk_druid", "maxSize": 0}]

druid.indexer.fork.property.druid.server.http.numThreads=50

druid.indexer.fork.property.druid.storage.baseKey=druid/index

druid.indexer.fork.property.druid.storage.bucket=bucket

druid.indexer.fork.property.druid.storage.type=s3

druid.worker.capacity=8

druid.worker.ip=ip

druid.worker.version=0

And the Beam I created is:

DruidBeams
 .builder((eventMap: scala.collection.mutable.Map[String, Any]) => new DateTime(eventMap(FieldNames.timestamp).asInstanceOf[String]))
 .curator(curator)
 .discoveryPath(discoveryPath)
 .location(DruidLocation.create(indexService, dataSource))
 .rollup(DruidRollup(SpecificDruidDimensions(dimensions), aggregators :+ new CountAggregatorFactory("count"), QueryGranularity.MINUTE))
 .tuning(
  ClusteredBeamTuning(
   segmentGranularity = Granularity.FIVE_MINUTE,
   windowPeriod = new Period(typesafeConfig.getString("PT2m")),
   partitions = 8,
   replicants = 1
  )
 ).timestampSpec(new TimestampSpec(FieldNames.timestamp, "iso", null))
 //   .beamMergeFn(beams => new HashPartitionBeam(beams.toIndexedSeq))
 .finagleRegistry(finagleRegistry)
 .buildBeam()

One thing to watch out for with Spark Streaming is re-creating objects needlessly for every DStream RDD partition. The best way I know to avoid this is to use singletons (a Scala “object”), so that each executor JVM builds the Beam once and reuses it. See the sample code here for an example: https://github.com/druid-io/tranquility/blob/master/docs/spark.md
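To illustrate the singleton idea without pulling in Spark or Tranquility, here is a minimal, dependency-free sketch. `ExpensiveSender`, `SenderHolder`, and `processPartition` are hypothetical names made up for this example: `ExpensiveSender` stands in for something costly to build (such as a Beam), and `processPartition` stands in for the function you would run on each RDD partition. It is not the Tranquility API; the linked doc has the real version.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for an expensive client such as a Tranquility Beam.
final class ExpensiveSender {
  // Record every construction so we can see how often it happens.
  ExpensiveSender.created.incrementAndGet()

  // Pretend to send the events; returns how many were consumed.
  def send(events: Iterator[Map[String, Any]]): Int = events.size
}

object ExpensiveSender {
  // Number of senders constructed in this JVM.
  val created = new AtomicInteger(0)
}

// Singleton holder: a lazy val inside an object is initialized
// at most once per JVM, no matter how many partitions use it.
object SenderHolder {
  lazy val sender: ExpensiveSender = new ExpensiveSender
}

object SingletonDemo {
  // Simulates the per-partition function of a streaming job.
  def processPartition(partition: Iterator[Map[String, Any]]): Int =
    SenderHolder.sender.send(partition)

  def main(args: Array[String]): Unit = {
    val event = Map[String, Any]("timestamp" -> "2015-01-01T00:00:00Z")
    val partitions = Seq.fill(4)(Seq(event))
    val sent = partitions.map(p => processPartition(p.iterator)).sum
    println(s"sent=$sent sendersCreated=${ExpensiveSender.created.get}")
  }
}
```

If `processPartition` instead called `new ExpensiveSender` directly, the counter would grow with every partition of every batch, which is the pattern to avoid.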

Generally, if things are well set up, you should expect an ingest rate of 5–50k events/sec per Druid partition (mostly depending on the size of your events and the dimensions/metrics you have set up in Druid).
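As a rough sanity check, the per-partition range can be scaled to the 8 partitions in the Beam above. This is only a back-of-the-envelope sketch: it assumes throughput scales linearly with the number of partitions, which may not hold in practice, and `ThroughputEstimate` is a name made up for the example.

```scala
object ThroughputEstimate {
  // Per-partition range quoted above, in events per second.
  val perPartitionLow: Long = 5000L
  val perPartitionHigh: Long = 50000L

  // Aggregate range, ASSUMING linear scaling across partitions.
  def aggregate(partitions: Int): (Long, Long) =
    (perPartitionLow * partitions, perPartitionHigh * partitions)

  def main(args: Array[String]): Unit = {
    val (low, high) = aggregate(8) // partitions = 8, as in the Beam above
    println(s"expected aggregate: $low to $high events/sec")
  }
}
```

Under that assumption, 8 partitions come out at roughly 40k to 400k events/sec in total. An observed rate far below the low end suggests the bottleneck is on the sending side rather than in the Druid tasks.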