Batch ingestion from Google Cloud Storage (GCS)

Is it possible to configure Druid to ingest from Google Cloud Storage? What credentials should be used?

Druid should support any filesystem that Hadoop supports, so this might help: https://cloud.google.com/hadoop/google-cloud-storage-connector. You probably need to put the Hadoop 2.x version of the connector JAR on Druid’s classpath.

With that in place, you could ingest using an actual Hadoop MapReduce job, or you could do it in local mode without a Hadoop cluster (Druid would just use the Hadoop local runner).
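If it helps, here is a minimal sketch of checking that the connector and credentials are wired up through plain Hadoop APIs before involving Druid at all. The bucket, input path, and service-account key file are placeholders, and the auth property names come from the GCS connector documentation, so double-check them against the connector version you are using:

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GcsConnectorSmokeTest
{
  public static void main(String[] args) throws Exception
  {
    Configuration conf = new Configuration();

    // Register the GCS connector as the handler for the gs:// scheme.
    conf.set("fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem");
    conf.set("fs.AbstractFileSystem.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS");

    // Authenticate with a service-account key file (property names may differ
    // between connector versions; these are from the connector docs).
    conf.set("google.cloud.auth.service.account.enable", "true");
    conf.set("google.cloud.auth.service.account.json.keyfile", "/path/to/service-account.json");

    // List a few objects to confirm the connector JAR and credentials are picked up.
    FileSystem fs = FileSystem.get(URI.create("gs://my-bucket/"), conf);
    for (FileStatus status : fs.listStatus(new Path("gs://my-bucket/events/2016-01-01/"))) {
      System.out.println(status.getPath() + " (" + status.getLen() + " bytes)");
    }
  }
}
```

If that listing works from the machine running the indexing task, then as far as I can tell the only Druid-specific parts are using gs:// URIs in the static inputSpec paths of the index_hadoop spec and passing the same properties through jobProperties in the tuningConfig.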

In my experience this is not true. I have Google Cloud Storage working with Druid, but to get Hadoop indexing to work I had to modify the switch statement at https://github.com/druid-io/druid/blob/2b35e909853e03298b0eb3cb6c0bbdde22f11eeb/indexing-hadoop/src/main/java/io/druid/indexer/JobHelper.java#L393 to support the gs:// scheme (as a fallback to the hdfs case).
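For anyone hitting the same thing: the change is essentially letting the gs scheme fall through to the hdfs branch of that switch. The following is only an illustration of the idea, with made-up names and simplified return values, not the actual JobHelper code:

```java
// Illustration only: hypothetical names and simplified return values; the real
// switch lives in io.druid.indexer.JobHelper and builds the segment metadata.
import java.net.URI;

public class SchemeFallthroughSketch
{
  static String handlingFor(URI segmentLocation)
  {
    switch (segmentLocation.getScheme()) {
      case "gs":   // Google Cloud Storage looks like any other Hadoop filesystem here,
      case "hdfs": // so let it fall through to the hdfs handling.
        return "hdfs";
      case "file":
        return "local";
      default:
        throw new IllegalArgumentException(
            "Unsupported scheme: " + segmentLocation.getScheme()
        );
    }
  }

  public static void main(String[] args)
  {
    // With the fallthrough in place, a gs:// segment path gets the hdfs-style handling.
    System.out.println(handlingFor(URI.create("gs://my-bucket/segments/index.zip")));
  }
}
```

In other words, once the connector is on the classpath, segments on gs:// can be treated like segments on any other Hadoop filesystem.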