Ingestion performance

Hi Druid Gurus,

We are using our Druid cluster (0.8.0) again after a while. Earlier, batch ingestion (index_hadoop) used to be fast, but now it takes ~2 hours to ingest 2 GB of data. Our Hadoop environment is version 2.6.0.

Looking at the ingest log, the MapReduce job took about ten minutes on our Hadoop cluster. I think the indexing is happening on our Hadoop cluster now. If I remember correctly, this process previously ran on our Druid cluster. Could this be a problem? Are there any configs that I should check?



Hi Kasi, did you update the version or change any configs? In general, for batch indexing we recommend running the job on an external Hadoop cluster, as it should be faster there.