I’m looking at installing Druid on a 100-node cluster. Before I proceed, I wanted to ask whether there is a way to provision the cluster automatically (something along the lines of how Ambari does it for Hadoop), or whether the only option is a manual install on each node with the corresponding configuration.

There is no built-in cloud platform for Druid. There are as many different ways of distributing binaries and configurations as there are startups in Silicon Valley.

Internally we package up a tarball and distribute that around using a rudimentary cluster distribution system.
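
For anyone who wants a starting point, here is a minimal sketch of the tarball approach. This is not the internal system mentioned above, just an illustration that assumes passwordless SSH to every node and uses plain scp and ssh. The host names, the tarball name druid-dist.tar.gz, and the install directory /opt/druid are all made up for the example. It defaults to a dry run that only builds the list of commands.

```python
import shlex
import subprocess
from typing import List


def build_commands(host: str, tarball: str, install_dir: str) -> List[List[str]]:
    """Return the scp and ssh commands that push and unpack the tarball on one host."""
    # Stage the tarball under /tmp on the remote node (hypothetical location).
    remote_tarball = "/tmp/" + tarball.rsplit("/", 1)[-1]
    unpack = "mkdir -p {d} && tar -xzf {t} -C {d}".format(
        d=shlex.quote(install_dir), t=shlex.quote(remote_tarball)
    )
    return [
        ["scp", tarball, "{}:{}".format(host, remote_tarball)],
        ["ssh", host, unpack],
    ]


def distribute(hosts: List[str], tarball: str, install_dir: str,
               dry_run: bool = True) -> List[List[str]]:
    """Build the full command plan; run it only when dry_run is False."""
    plan = []
    for host in hosts:
        for cmd in build_commands(host, tarball, install_dir):
            plan.append(cmd)
            if not dry_run:
                # check=True stops at the first failing node instead of continuing silently.
                subprocess.run(cmd, check=True)
    return plan


if __name__ == "__main__":
    # Hypothetical host names; replace with your own inventory.
    hosts = ["druid-node-{:03d}.example.com".format(i) for i in range(1, 4)]
    for cmd in distribute(hosts, "druid-dist.tar.gz", "/opt/druid"):
        print(" ".join(cmd))
```

Run it as is to see the plan, then pass dry_run=False once the commands look right. On 100 nodes you would probably want to run the copies in parallel and push per-node configuration separately, which this sketch does not attempt.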

I’ve heard of folks using Marathon to distribute their Druid packages as well. You’ll have to figure out which cluster resource management system works best for your group.