Running Druid in production using Kubernetes


I am currently using the docker-druid image for demo purposes, deployed on a Kubernetes cluster on IBM Cloud. My application workflow works fine: events are persisted to Druid. However, in the Kubernetes environment the pods keep going down and coming back up automatically (handled by Kubernetes), resulting in loss of data. This could probably be tackled by using HDFS for storage, but I am not sure how ZooKeeper needs to be handled.
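For what it's worth, one pattern I have been considering to keep pod restarts from wiping local state is to run the stateful Druid processes (e.g. the Historical) as a StatefulSet with persistent volumes rather than a plain Deployment. A rough sketch, where the image name, mount path, and sizes are placeholders and not from any official chart:

```yaml
# Hypothetical sketch: run the Historical process as a StatefulSet so each
# replica keeps a stable identity and a persistent volume across restarts.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: druid-historical
spec:
  serviceName: druid-historical
  replicas: 2
  selector:
    matchLabels:
      app: druid-historical
  template:
    metadata:
      labels:
        app: druid-historical
    spec:
      containers:
        - name: historical
          image: druid:latest        # placeholder image name
          volumeMounts:
            - name: segment-cache
              mountPath: /opt/druid/var   # assumed data directory
  volumeClaimTemplates:
    - metadata:
        name: segment-cache
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi
```

This only protects the segment cache and local working state; it is not a substitute for proper deep storage, so the question about externalizing dependencies still stands.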

I learned that the docker-druid image isn’t suitable for production. Neither is the Imply docker image, according to the official Imply documentation:

“This is a Dockerized version of Imply, designed for easily running the quickstart (single-machine, non-clustered). It isn’t currently supported for use in production; for that, we still recommend using the downloadable distribution”

Given this, can we deploy Druid reliably using Docker on a Kubernetes cluster? Are there any best practices for doing this?



Would making the following components external help ensure that there is no data loss when the Druid containers deployed on Kubernetes go down?

  1. Use an external ZooKeeper cluster

  2. Use an external metadata store, e.g. an external MySQL

  3. Use NFS or HDFS as the deep storage type
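If it helps the discussion, my understanding is that the three externalized dependencies above map roughly to these settings in Druid’s common.runtime.properties. The hostnames, ports, and credentials below are placeholders, not a tested configuration:

```
# 1. External ZooKeeper ensemble (placeholder hostnames)
druid.zk.service.host=zk-0.zk:2181,zk-1.zk:2181,zk-2.zk:2181

# 2. External metadata store (MySQL needs the mysql-metadata-storage extension)
druid.extensions.loadList=["mysql-metadata-storage","druid-hdfs-storage"]
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://mysql.example.com:3306/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=changeme

# 3. HDFS deep storage (placeholder namenode address)
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://namenode:8020/druid/segments
```

Please correct me if the mapping is off; I have pieced this together from the docs rather than from a running production cluster.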

As of now i am using defaults provided by docker-druid i.e. zookeeper and mysql that comes bundled with the image and storage type as “local” ( i think so but i didn’t find the configuration in supervisor.conf). Due to this every time the druid pod goes down the data would be lost. I understand that atleast the storage should not be local but going by couple of other threads in this forum i think the above mentioned dependencies should be made external. Please correct in case the above assumptions are incorrect.