Dynamic Peon Configuration

Hi -

I have a couple of questions on peon configuration. I am using the Imply distribution with 2 data nodes, each with the following configuration -

r3.2xlarge

  • 8 vCPUs
  • 61 GB RAM
  • 160 GB SSD storage

Because I have historical nodes running on the same servers as the middle managers, I set druid.worker.capacity = 6 [number of available processors - 1] on each server. All my datasources have a segmentGranularity of 15 minutes and a windowPeriod of 10 minutes, so essentially each worker holding a segment will wait for about 25 minutes before it finalizes/persists the segment.

So at any given time, I cannot have more than 12 tasks (segments) in the running state.
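For reference, the relevant parts of my ingestion specs look roughly like this (a minimal sketch assuming Tranquility/realtime-style ingestion; everything else is omitted):

  "granularitySpec": {
    "type": "uniform",
    "segmentGranularity": "FIFTEEN_MINUTE"
  },
  ...
  "tuningConfig": {
    "type": "realtime",
    "windowPeriod": "PT10M"
  }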

Here is my middle manager configuration -

druid.service=druid/middlemanager
druid.port=8091

# Number of tasks per middleManager

druid.worker.capacity=6

# Task launch parameters

druid.indexer.runner.javaOpts=-server -Xmx3g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
druid.indexer.task.baseTaskDir=var/druid/task
druid.indexer.task.restoreTasksOnRestart=true

# HTTP server threads

druid.server.http.numThreads=50

# Processing threads and buffers

druid.processing.buffer.sizeBytes=536870912
druid.processing.numThreads=2

# Hadoop indexing

druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp
druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.3.0"]

# Store task logs in deep storage

druid.indexer.logs.type=s3
druid.indexer.logs.s3Bucket=pclndruid
druid.indexer.logs.s3Prefix=prod/logs/v1

#monitors
druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]


Here are my questions -

  1. Is there a way to use less memory for certain workers, so that I can increase the number of workers depending on how much data I am going to ingest into each segment? Some of my datasources only generate about 6k rows per day, while others generate about 100k rows per second, segmented into 15-minute intervals.
  2. What happens if I increase the worker capacity without considering the number of processors?
  3. How much memory does each peon task require? If a segment contains a lot of data, does having less memory affect how the data in the segment is aggregated?
  4. Do peons use just the JVM memory, or do they also use disk space to aggregate and finalize segments?
  5. How can I have a strategy to isolate workers based on their load?

Any advice? Let me know. Thanks!

You might be interested in https://groups.google.com/d/msg/druid-development/1I3CmxlOipM/e3-SpWqG170J which is a way of launching tasks that I’ve been working on. It is currently on the back burner behind QTL, though (which should hopefully be mostly complete for 0.9.1).

Pretty much all of your questions are “How can I better define resource expectations per task?”

In current master there are two ways to do this: 1) customize the number of resource slots each task takes; 2) use the fill-capacity-with-affinity scheduling to assign datasources to particular middle managers.
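For (1), a task can declare how many worker slots it should occupy via the "resource" block of its task spec; roughly something like the following (just a sketch, and the availabilityGroup name is a placeholder):

  "resource": {
    "availabilityGroup": "someGroup",
    "requiredCapacity": 2
  }

A task with requiredCapacity = 2 counts as two slots against druid.worker.capacity, so heavier tasks end up with fewer neighbors on the same middle manager.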

I don’t think either of these quite accomplishes what you are trying to do, though, which is why Task Tiering as a way of organizing resource expectations is on my list of things to add/improve.

Thanks Allen!

The design looks promising and answers my questions.

But in the interim, can you tell me if there is a way to direct some of the datasources to specific middle managers / peons?

Thanks!

Can you also please answer #3 and #4?

Thanks!

Each peon is going to use its heap to merge query results for certain query types, and to store up to maxRowsInMemory * (2 + maxPendingPersists) aggregated rows for indexing. It’ll write everything else to disk and then memory-map it. And like historical nodes, each peon has a processing thread pool with one processing buffer per processing thread.

For best performance you usually want to have enough memory set aside per peon for its heap, its processing buffers, and for the OS to be able to keep all/most of its memory-mapped data in the page cache.
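As a rough back-of-the-envelope with your current settings (3g peon heap, 2 processing threads x 512MB buffers, 6 workers per node), assuming the processing buffers are fully allocated:

  peon heaps:              6 x 3 GB       = 18 GB
  peon processing buffers: 6 x 2 x 512 MB =  6 GB
  peons total:                             ~24 GB

Whatever is left of the 61 GB after that (and after the historical, the middle manager itself, and OS overhead) is what the page cache has to work with for the peons’ memory-mapped segment data, so raising worker capacity or heap sizes eats directly into that.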

Hi Jagdeesh,
To answer your question, “But in the interim, can you tell me if there is a way to direct some of the datasources to specific middle managers / peons?”:

Yes, there is an affinity-based worker selection strategy that you can use at present to assign datasources to specific workers.

Refer to “Fill capacity with affinity” here: http://druid.io/docs/latest/configuration/indexing-service.html
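The exact payload is described in that doc, but it is roughly along these lines, POSTed to the overlord’s dynamic worker config endpoint (the datasource name and host:port values are just placeholders):

  curl -X POST -H 'Content-Type: application/json' \
    http://OVERLORD_HOST:8090/druid/indexer/v1/worker \
    -d '{
      "selectStrategy": {
        "type": "fillCapacityWithAffinity",
        "affinityConfig": {
          "affinity": {
            "your_high_volume_datasource": ["middlemanager1:8091", "middlemanager2:8091"]
          }
        }
      }
    }'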

Good to know. Will the submitted affinity config be stored as a rule, or is it a one-time thing while the nodes are up and running?

Thanks!

The worker config will be persisted in the Druid metadata store in a config table, and will be read from there again on later overlord restarts.
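If you want to double-check what is currently stored, you can also GET the same endpoint on the overlord, e.g.:

  curl http://OVERLORD_HOST:8090/druid/indexer/v1/worker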