Apache Druid Data Nodes Memory Issues

Throughput - 1 Million saves per second
Hardware -
16 vCPU
48GB RAM
800GB SSD

Druid Data Node Configs-

MIDDLE MANAGER CONFIGS

druid.service=druid/middleManager

druid.plaintextPort=8091

Number of tasks per middleManager

druid.worker.capacity=15

Task launch parameters

druid.indexer.runner.javaOpts=-server -Xms256m -Xmx3g -XX:MaxDirectMemorySize=10g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -XX:+ExitOnOutOfMemoryError -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

druid.indexer.task.baseTaskDir=var/druid/task

HTTP server threads

druid.server.http.numThreads=50

Processing threads and buffers on Peons

druid.indexer.fork.property.druid.processing.numMergeBuffers=2

druid.indexer.fork.property.druid.processing.buffer.sizeBytes=100000000

druid.indexer.fork.property.druid.processing.numThreads=1

Hadoop indexing

druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp
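As a quick sanity check, here is a rough back-of-the-envelope sketch (only a sketch based on the values above, assuming the worst case where every peon grows to its -Xmx and MaxDirectMemorySize ceilings) of what this MiddleManager config allows the peons to claim on a single 48GB node:

```python
# Peon memory budget implied by the MiddleManager config above.
# This is a sketch, not a measurement.

GB = 1024 ** 3

worker_capacity = 15        # druid.worker.capacity
peon_heap_max = 3 * GB      # -Xmx3g in druid.indexer.runner.javaOpts
peon_direct_max = 10 * GB   # -XX:MaxDirectMemorySize=10g

# Direct memory the processing buffers actually call for per peon, using the
# usual (numMergeBuffers + numThreads + 1) * buffer.sizeBytes rule of thumb.
merge_buffers = 2           # ...processing.numMergeBuffers
processing_threads = 1      # ...processing.numThreads
buffer_size = 100_000_000   # ...processing.buffer.sizeBytes
direct_needed = (merge_buffers + processing_threads + 1) * buffer_size

worst_case = worker_capacity * (peon_heap_max + peon_direct_max)
likely_case = worker_capacity * (peon_heap_max + direct_needed)

print(f"worst case (all JVM limits hit): {worst_case / GB:.0f} GB")   # ~195 GB
print(f"heap limits + buffer need only : {likely_case / GB:.0f} GB")  # ~51 GB
```

Even the lower figure is roughly the whole 48GB node before the Historical process, the OS, and the page cache for memory-mapped segments get anything.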

HISTORICAL CONFIGS

druid.service=druid/historical

druid.plaintextPort=8083

HTTP server threads

druid.server.http.numThreads=60

Processing threads and buffers

druid.processing.buffer.sizeBytes=500000000

druid.processing.numMergeBuffers=60

druid.processing.numThreads=1

druid.processing.tmpDir=var/druid/processing

druid.broker.http.numConnections=25

Segment storage

druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":300000000000}]

druid.server.maxSize=300000000000

Query cache

druid.historical.cache.useCache=true

druid.historical.cache.populateCache=true

druid.cache.type=caffeine

druid.cache.sizeInBytes=256000000
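A similar sketch for the Historical config above; the notable numbers are the direct memory implied by 60 merge buffers of 500MB each, and the 300GB segment cache, which is memory-mapped and served from the OS page cache rather than the JVM heap:

```python
# Direct-memory and segment-cache picture implied by the Historical config
# above -- again just arithmetic over the configured values.

GB = 1024 ** 3

buffer_size = 500_000_000    # druid.processing.buffer.sizeBytes
merge_buffers = 60           # druid.processing.numMergeBuffers
processing_threads = 1       # druid.processing.numThreads

# (numMergeBuffers + numThreads + 1) * buffer.sizeBytes direct-memory estimate
direct_estimate = (merge_buffers + processing_threads + 1) * buffer_size
print(f"Historical direct memory estimate: {direct_estimate / GB:.1f} GB")   # ~28.9 GB

segment_cache = 300_000_000_000   # segmentCache maxSize / druid.server.maxSize
print(f"Segment cache ceiling (mmapped, page cache): {segment_cache / GB:.0f} GB")  # ~279 GB
```

Roughly 29GB of direct memory for the Historical alone, on top of the peons above, leaves very little of the 48GB for heaps and for the page cache that the memory-mapped segments rely on, which may be related to the symptom described below.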

Issue -
RAM runs out within 1 hour of starting the load test. Testing with 1 - 1.5 Million saves per second.
3 Zookeepers
3 Kafkas
3 Druid Data Nodes, 16 Tasks

The buffer/cache is filled up by the mmapped segments.

Welcome @omohammed!

Can you please share more specifics? Are you seeing OOMs?

Best,

Mark

What are saves? Messages ingested from a stream?
If so, how many partitions does the Kafka topic have? Can you share the ingestion spec?

That is some serious throughput. Perhaps it is better to test with lower throughput, tune the ingestion spec, and then scale the message throughput, MiddleManager capacity, and task count in the ingestion spec once you've determined how much each task can handle.
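For reference, the task count lives in the supervisor spec's ioConfig. A trimmed sketch of a Kafka supervisor spec is below; every value here (topic name, broker address, counts, durations) is a placeholder for illustration, not a recommendation:

```python
import json

# Skeleton of a Kafka supervisor spec, reduced to the knobs mentioned above.
# All values are placeholders.
supervisor_spec = {
    "type": "kafka",
    "spec": {
        "dataSchema": {"dataSource": "saves"},        # timestampSpec/dimensionsSpec elided
        "ioConfig": {
            "topic": "saves",                         # hypothetical topic name
            "consumerProperties": {"bootstrap.servers": "kafka1:9092"},
            "taskCount": 16,                          # reading tasks -- the main scaling knob
            "replicas": 1,
            "taskDuration": "PT1H"
        },
        "tuningConfig": {
            "type": "kafka",
            "maxRowsInMemory": 150000                 # rows held in heap before spilling to disk
        }
    }
}

print(json.dumps(supervisor_spec, indent=2))
```

Raising taskCount only helps while the MiddleManagers have free worker capacity and the nodes have memory headroom for the extra peons.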

Thank you for the reply. Yes, saves are messages per second. The number of partitions is 500.

We are using 2 SSDs; when we tried, the problem starts after 500k messages.

Can you please suggest the number of Kafka brokers, Druid data nodes, and tasks needed to handle 1 million messages per second?

The size of each message could be up to 30 KB.

It is hard to say with any certainty without some benchmarking and testing of the limits. 500 partitions sounds very large for a single topic.
Here's an interesting conversation on the Kafka side of that equation:

Additionally, take a look at these benchmarks for the Druid side of the ingestion: Farewell Lambda Architectures: Exactly-Once Streaming Ingestion in Druid - Imply

Your setup needs will depend significantly on your data demographics. The results table presented at the bottom of the article should give you some idea of what each ingestion task can achieve with certain data demographics and therefore how many of them you would need to achieve your throughput. I think you’ll find this paragraph in that blog post particularly useful:

> In our testing, we were able to achieve a sustained aggregate ingestion rate of **3.3M events/sec** on a single r3.8xlarge instance when indexing simple events with very high roll-up. When ingesting more complicated data (10 low cardinality dimensions + 1 high cardinality dimension) which required more processing power for index generation and frequent spills to disk, a single instance was able to handle just over 600k events/sec.
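Purely as illustrative arithmetic with the figures quoted above and the numbers from your posts (keeping in mind the blog's results are per r3.8xlarge instance, a much larger box than the 16 vCPU / 48GB nodes described here):

```python
# Illustrative only: relate the 1M messages/sec target to the per-instance
# figures quoted from the blog post above.
target_rate = 1_000_000        # messages/sec from the original post
per_node_complex = 600_000     # events/sec, "complicated data" case in the blog
per_node_simple = 3_300_000    # events/sec, simple high roll-up case in the blog

print(f"instances needed, complicated data: ~{target_rate / per_node_complex:.1f}")  # ~1.7
print(f"instances needed, simple data     : ~{target_rate / per_node_simple:.2f}")   # ~0.30

# Raw ingress if every message hits the stated 30 KB maximum
max_ingress = target_rate * 30 * 1024
print(f"worst-case ingress: {max_ingress / 1024**3:.1f} GiB/s")  # ~28.6 GiB/s
```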

I think you will need to test with your own data demographics in order to tune and design the ingestion specifically for your deployment.

Let us know how it goes.

Sergio