I am trying to set up a Druid cluster on an AWS r3.2xlarge instance with 60GB of memory.
My main use case is to use the Tranquility standalone core API to send events to the ‘overlord’ node, following https://github.com/druid-io/tranquility/blob/master/docs/core.md
and then try out some sample filter+groupby queries.
What would be the best assignment of the 60GB memory amongst the various nodes?
I am thinking:
Do you think I need to give more memory to Overlord and MiddleManager since Tranquility directly connects to Overlord?
Also, the MiddleManager doesn’t need that much memory. Someone said it runs with 64MB. Its peons (workers) need 2G-3G of memory, though; the exact figure depends on the number of threads.
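To make the arithmetic concrete, here is a rough sketch of how that budget could be worked out. The function names and the worker count of 3 are hypothetical, chosen only for illustration; only the 64MB and 2G-3G figures come from the discussion above. It counts heap only, so any off-heap buffers would come on top of it.

```python
# Rough memory budget for a single 60 GB box.
# All names and the worker count below are illustrative assumptions.
TOTAL_GB = 60


def peon_budget_gb(worker_capacity, heap_per_peon_gb):
    """Heap the peons can use when every worker slot is busy."""
    return worker_capacity * heap_per_peon_gb


def remaining_gb(total_gb, middle_manager_gb, worker_capacity, heap_per_peon_gb):
    """Memory left for the other roles, Zookeeper, the metadata store, and the OS."""
    used = middle_manager_gb + peon_budget_gb(worker_capacity, heap_per_peon_gb)
    return total_gb - used


# MiddleManager at 64 MB, 3 peons at 3 GB each (worker count is hypothetical).
left = remaining_gb(TOTAL_GB, 64 / 1024, worker_capacity=3, heap_per_peon_gb=3)
print(f"{left:.2f} GB left for the other roles")
```

Changing the worker count or the per-peon heap shows quickly how much is left for everything else on the box.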
This is an experimental setup, right? Why do you want to run all roles on a single box? A production setup needs multiple boxes for redundancy, and for distributing queries to get better performance. If it’s experimental, I suggest that you run the Imply suite from http://imply.io and begin with the default configuration. It runs fine on my 16GB laptop. Then tweak the number of threads and the JVM heap.
By the way, you still need to run Zookeeper and an RDBMS (MySQL, Postgres, or Derby) for metadata.
Thanks B-Slim. The information is helpful.
This is purely experimental; the aim is to be comfortable with the system and how to load data/run queries.
We plan to run in a testing cluster next, with individual boxes assigned per node type. Will definitely try the Imply suite and keep you posted.