Cluster configuration issues

Hi everybody!
I’m trying to configure a Druid cluster. I followed all the tutorials we can find on druid.io, but I have a question. I’m trying to set it up on VirtualBox with one master server, one query server and two data servers. Does anyone know how much memory I have to allocate to each server?
I’m running the cluster with Druid 0.14.2 on CentOS, and my computer has 8 GB of RAM. I’m asking because it looks like running the four servers requires far more memory than that.

From what I saw, the master server should need 2200 MB of memory, the query server 9600 MB and the data server 5300 MB.

In fact, there’s something I don’t understand… Does that mean I can’t install the cluster on my computer?

Thanks in advance

Hi Clem,
The Druid services’ memory can all be tweaked within the jvm.config. Just to get things running on your 8 GB machine, you should be able to tune the services down to use about 6 GB in total. For example, the quickstart configs for the Imply distribution use the following, just to get started.

Broker - 256MB

Coordinator - 128MB

Historical - 256MB

Overlord - 128MB

MiddleManager - 64MB

Router - 128MB

You can use the above as a rough guide to get the services started in your multi-node VM setup. Hope this helps.
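
As a quick sanity check, here is a small Python sketch that adds up the figures listed above and compares them with a memory budget. It assumes the numbers are per-service JVM heap sizes in MB; the names `QUICKSTART_MB`, `total_mb` and `fits` are made up for illustration, and real usage will be higher once direct memory and the OS itself are counted.

```python
# Per-service figures quoted above, in MB.
# Assumption: these are JVM heap sizes; direct memory and OS overhead are not included.
QUICKSTART_MB = {
    "Broker": 256,
    "Coordinator": 128,
    "Historical": 256,
    "Overlord": 128,
    "MiddleManager": 64,
    "Router": 128,
}


def total_mb(services):
    """Sum the per-service memory figures (MB)."""
    return sum(services.values())


def fits(services, budget_mb):
    """Return True if the summed figures stay within the given budget (MB)."""
    return total_mb(services) <= budget_mb


if __name__ == "__main__":
    print("Total:", total_mb(QUICKSTART_MB), "MB")
    print("Fits in 6 GB:", fits(QUICKSTART_MB, 6 * 1024))
```

With these figures the sum is 960 MB, which leaves plenty of headroom on an 8 GB host.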

Regards,

ROBERT MOLINA

Hello Robert! Thank you for your help.

There’s something I don’t understand (I’m a beginner, sorry if it seems obvious). What does “Broker - 256MB” mean? For example, the picture I attached says that launching the query server needs roughly 9000 MB. How does that relate to the memory I allocate when configuring my VM?

Thanks in advance.

On Wed, 26 June 2019 at 16:41, Robert Molina robert.molina@imply.io wrote:

“Broker - 256MB” means that, in the demo configuration, the Broker will consume that much memory.

Having said that, you talk about “one master server, one query server and 2 data servers”. Basically, in Druid language, it translates to “one Broker, two Historicals and one server which runs ZooKeeper + Overlord + Coordinator + MiddleManager + Router”.

Just look at which services run when Druid starts. The best way to learn is what Rob suggested: run the quickstart setup and observe these things -

  1. Which services are spun up.
  2. What the memory consumption looks like.
  3. Which configs depend on vCPUs and memory (things like processing threads, direct memory, heaps, etc.)
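
To illustrate how those configs tie together, here is a rough Python sketch of the direct-memory rule of thumb as I understand it from the Druid docs: the processing buffer size multiplied by (merge buffers + processing threads + 1). The function name and the example values are hypothetical, so check the rule against the docs for your version.

```python
def required_direct_memory_bytes(buffer_size_bytes, num_threads, num_merge_buffers):
    """Estimate the direct memory a process needs for its processing buffers.

    The arguments correspond to druid.processing.buffer.sizeBytes,
    druid.processing.numThreads and druid.processing.numMergeBuffers.
    Assumed rule of thumb: sizeBytes * (numMergeBuffers + numThreads + 1).
    """
    return buffer_size_bytes * (num_merge_buffers + num_threads + 1)


if __name__ == "__main__":
    # Hypothetical small-VM values: 100 MB buffers, 2 threads, 2 merge buffers.
    needed = required_direct_memory_bytes(100_000_000, 2, 2)
    print(needed, "bytes of direct memory")
```

The result is the minimum to aim for when setting -XX:MaxDirectMemorySize in that service's jvm.config, on top of the heap.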

Once you establish that, you might then want to delve a little deeper -

  1. How much memory would you need to set aside for Historicals?
  2. Do you need resilience? What would the replication factor be?
  3. If you are ingesting heavy data, how many MiddleManagers would you require? (Basically, you need to figure out how many workers you need and how many instances it takes to accommodate them.)
  4. How many Brokers would you need to reach the concurrency you require?
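
For question 3, a back-of-the-envelope sketch in Python: divide the number of ingestion tasks you expect to run at once by the task slots per MiddleManager (druid.worker.capacity) and round up. The function name and the numbers are hypothetical, and this ignores whether each instance actually has the memory and CPU for that many tasks.

```python
import math


def middle_managers_needed(concurrent_tasks, worker_capacity):
    """Return how many MiddleManagers are needed to run the given number of tasks.

    worker_capacity is the number of task slots per MiddleManager
    (druid.worker.capacity); the result is rounded up to a whole instance.
    """
    if worker_capacity <= 0:
        raise ValueError("worker_capacity must be positive")
    return math.ceil(concurrent_tasks / worker_capacity)


if __name__ == "__main__":
    # Hypothetical load: 10 concurrent tasks, 3 task slots per MiddleManager.
    print(middle_managers_needed(10, 3))
```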

Sadly, there is no one answer that fits all. You are the master of your cluster, and Druid will be happy to oblige any of your needs :)

Hello,

You can install all the server processes on a single instance; just scale each one down appropriately, but be aware that your query and ingest speeds are affected by the amount of resources available. If you want an all-in-one Druid, try using the quickstart from Imply. It has a stock Apache Druid with all the processes set up to coexist on a single machine, for testing on a laptop or single machine. It is only a 30-day trial, but it should get you to a point where you can do some simple tests. You can also reverse the process and use it to build a more traditional multi-node cluster.

Hello, thank you all for your answers!

By the way, I have been trying to configure the cluster with the quickstart configuration. I followed the clustering tutorial, so I guess it should work. Unfortunately, I am not able to load data… No exceptions are raised…

Could you help me?