Having trouble with data ingestion in Druid datastore on Google Cloud Compute Engine


I have set up a Google Compute Engine instance with 4 vCPUs (15 GB RAM) and followed the instructions in the Druid quickstart guide.

  • Ensured Java 8 is installed.
  • Installed Zookeeper and started the process.
  • Installed Druid and started the process.
  • Started the five Druid processes in separate terminal windows and confirmed that each service printed its startup log messages.
  • Once all the services were up, submitted an ingestion task pointing to the file by POSTing it to Druid from the druid-0.9.2 directory in a new terminal window:
    curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/wikiticker-index.json localhost:8090/druid/indexer/v1/task

Got an ID confirming the task submission: {"task":"index_hadoop_wikipedia_2013-10-09T21:30:32.802Z"}

I then ran "curl http://localhost:8090/console.html" to check the task status, but even after hours of refreshing the console it still returns "Loading the data, …it may take few mins". I expected the ingestion to finish within a few minutes and the task to report "SUCCESS", but I have been stuck on this for two days. Can someone please help me with the Druid data loading procedure? I would really appreciate it.


Can you post your task and overlord logs?
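
If it helps, here is a rough sketch of how to pull the task status and task log straight from the overlord over HTTP instead of going through console.html. My guess is that console.html is rendered by JavaScript in the browser, so curl only ever sees the static "Loading" placeholder. The sketch assumes the overlord is on localhost:8090 as in your curl command; the function names (task_url, task_status, task_log) are just illustrative helpers I made up, not part of Druid.

```shell
# Assumption: the overlord listens on localhost:8090, as in the question.
OVERLORD="localhost:8090"

# Build the overlord URL for one task resource ("status" or "log").
# $1 = task ID, $2 = resource name.
task_url() {
  printf 'http://%s/druid/indexer/v1/task/%s/%s\n' "$OVERLORD" "$1" "$2"
}

# Print the overlord's JSON status document for a task.
task_status() {
  curl -s "$(task_url "$1" status)"
}

# Print the task log to stdout; redirect it to a file to share it.
task_log() {
  curl -s "$(task_url "$1" log)"
}
```

With the ID you got back, that would be task_status "index_hadoop_wikipedia_2013-10-09T21:30:32.802Z" to see whether the task is still running or has failed, and task_log "index_hadoop_wikipedia_2013-10-09T21:30:32.802Z" > task.log to save the task log. The overlord log should be whatever was printed in the terminal window where you started the overlord.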