Hi Ankit or Druid folks,
The benchmarking blog post is woefully incomplete for Druid newbies like me.
Using druid-0.82-rc1 and the configs provided, I barely managed to get the cluster up and running. I had to comment out the monitoring section (druid.monitoring.monitors) because I got a sigar library exception. I’m running one broker node, which also runs ZooKeeper, the coordinator JVM, and the broker JVM. The compute node runs a “historical” JVM. Do I need to run an indexer node as well?
The next step is to load the 100 GB of data. I have downloaded the data and uploaded it to an S3 bucket.
Looking at the task description (https://github.com/druid-io/druid-benchmark/blob/master/lineitem.task.json), I see that it specifies the 0.20.205-emr version, which Amazon EMR does not support. Would it work with the latest EMR version, 4.1.0? Finally, where do I provide the info for the EMR endpoints?
I apologize for these newbie/basic questions, but I could not find the answers anywhere.
Thanks in advance,