Is anyone using Druid to access an S3 bucket with an IAM role instead of hard-coded keys?

This pull request was merged (https://github.com/druid-io/druid/pull/837), so it should be possible, but it's not working for me. There must be more to it than just putting s3://bucket/ for the 'paths' in the inputSpec.

Hi Scott,

I haven't tried this myself, but looking at the code, the InstanceProfileCredentialsProvider is last in the provider chain.

I think you’ll also have to ensure that there isn’t any existing accessKey/secretKey configuration used by the providers earlier in the chain so that it falls back to using IAM.
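Concretely, I'd check that nothing earlier in the chain can supply credentials: no AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY in the environment, no aws.accessKeyId / aws.secretKey system properties, and no static keys in the common runtime properties. Something like this is what I'd expect to work (untested on my end; the bucket and paths are placeholders):

# common.runtime.properties (sketch)
druid.storage.type=s3
druid.storage.bucket=your-bucket
druid.storage.baseKey=druid/segments
# leave the static keys unset so the provider chain falls through to the IAM role
# druid.s3.accessKey=
# druid.s3.secretKey=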

Thanks,

Jon

Hi Jonathan,

This is a brand new EC2 instance running the quickstart Druid. No keys anywhere, in the environment or otherwise. The IAM role is attached, but Druid keeps complaining about credentials until I put keys in:

"tuningConfig" : {
  "type" : "hadoop",
  "partitionsSpec" : {
    "type" : "hashed",
    "targetPartitionSize" : 5000000
  },
  "jobProperties" : {
    "fs.s3n.awsAccessKeyId" : "id",
    "fs.s3n.awsSecretAccessKey" : "secret"
  }
}

Hi Scott,

The area affected by PR #837 controls the credentials for Druid's deep storage, but it turns out that Hadoop uses its own credential configuration for the indexing job.

I think the issue you're seeing may be that Hadoop doesn't support IAM roles when using S3N; roles are only supported when using S3A, which requires Hadoop 2.7:

https://issues.apache.org/jira/browse/HADOOP-9384

https://issues.apache.org/jira/browse/HADOOP-10400

https://wiki.apache.org/hadoop/AmazonS3
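In principle, then, on a Hadoop 2.7 cluster with the hadoop-aws jar on the classpath, an inputSpec along these lines should pick up the role without any keys, since the S3A client can fall back to instance profile credentials on its own (just a sketch, the bucket and path are placeholders and I haven't tried it myself):

"ioConfig" : {
  "type" : "hadoop",
  "inputSpec" : {
    "type" : "static",
    "paths" : "s3a://your-bucket/path/"
  }
}

i.e. no fs.s3n.* keys and no fs.s3a.access.key / fs.s3a.secret.key entries in jobProperties at all.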

Thanks,

Jon

Wow, I thought I was confused before…

If I just change my index task ioConfig path from s3n://… to s3a://…, it fails with:

Caused by: java.lang.RuntimeException: java.io.IOException: No FileSystem for scheme: s3a

Yeah, s3a isn’t really supported in druid right now: https://github.com/druid-io/druid/issues/2748
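For what it's worth, "No FileSystem for scheme: s3a" means nothing on the indexing task's classpath registers a FileSystem implementation for the s3a scheme. On a plain Hadoop 2.7 setup the usual fix is getting the hadoop-aws jar (and its matching aws-java-sdk jar) onto the classpath and, if needed, spelling out the implementation class in jobProperties, roughly like this (untested sketch):

"jobProperties" : {
  "fs.s3a.impl" : "org.apache.hadoop.fs.s3a.S3AFileSystem"
}

But I wouldn't expect that to just work with the Hadoop client Druid currently bundles, which is more or less what that issue is about.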