How to hide/mask S3 credentials in indexing request

Hi,

Need help regarding hiding/masking S3 credentials that were submitted in the indexing request.

We are currently passing the S3 access key and secret key in the indexing request, but this causes privacy/security issues because these credentials are printed in the logs. Is there any way to hide/mask the credentials in the logs, or is there another approach for this?

Below is the sample indexing template that we are using.

"tuningConfig" : {
  "type" : "hadoop",
  "jobProperties" : {
    "fs.s3.awsAccessKeyId" : "ACCESS_KEY_ID",
    "fs.s3n.awsAccessKeyId" : "ACCESS_KEY_ID",
    "fs.s3.awsSecretAccessKey" : "SECRET_ACCESS_KEY",
    "fs.s3n.awsSecretAccessKey" : "SECRET_ACCESS_KEY",
    "fs.s3.impl" : "org.apache.hadoop.fs.s3native.NativeS3FileSystem",
    "fs.s3n.impl" : "org.apache.hadoop.fs.s3native.NativeS3FileSystem"
  }
}

Thanks

Hi,
Here is a quote of a former answer about how you can get rid of your credentials in your spec.

Also, to get rid of credentials in your ingestion spec, you need to use s3a instead of s3n and do the following

Set this property in your Druid config

druid.storage.useS3aSchema=true

Then your spec can contain:

"fs.s3a.awsAccessKeyId": "accesskey",
"fs.s3a.awsSecretAccessKey": "secretkey",
"fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
"fs.s3a.server-side-encryption-algorithm": "AES256",
"fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",

fs.s3.impl is still needed to make it work.

Also, an even more secure way is to pass a spec with no credentials in it at all:

"fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
"fs.s3a.aws.credentials.provider": "com.amazonaws.auth.InstanceProfileCredentialsProvider",
"fs.s3a.server-side-encryption-algorithm": "AES256",
"fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",

This requires your EMR instance role to have policies granting access to your S3 buckets (the source bucket AND Druid deep storage, if applicable) and kms:Decrypt on your KMS key. If you use an instance role profile, you can also omit the credentials.provider property as long as you don't provide any other credential properties (InstanceProfileCredentialsProvider is the last authentication method checked; see https://hadoop.apache.org/docs/r2.8.3/hadoop-project-dist/hadoop-common/core-default.xml for more details).
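As a rough illustration of the credential-free spec described above, here is a small Python sketch that assembles the same jobProperties and checks that no inline credential keys are present before the spec is submitted. The helper names (build_tuning_config, inline_credential_keys) and the list of key markers are made up for this example; they are not part of Druid or Hadoop.

```python
import json

# Substrings that suggest a property carries an inline credential.
# This list is an assumption for illustration, not an official one.
CREDENTIAL_MARKERS = ("awsAccessKeyId", "awsSecretAccessKey", "access.key", "secret.key")


def build_tuning_config():
    """Return a Hadoop tuningConfig that relies on the instance profile."""
    return {
        "type": "hadoop",
        "jobProperties": {
            "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
            "fs.s3a.aws.credentials.provider":
                "com.amazonaws.auth.InstanceProfileCredentialsProvider",
            "fs.s3a.server-side-encryption-algorithm": "AES256",
            "fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
        },
    }


def inline_credential_keys(tuning_config):
    """Return the sorted jobProperties keys that look like inline credentials."""
    props = tuning_config.get("jobProperties", {})
    return sorted(
        key for key in props
        if any(marker in key for marker in CREDENTIAL_MARKERS)
    )


if __name__ == "__main__":
    config = build_tuning_config()
    leaked = inline_credential_keys(config)
    if leaked:
        raise SystemExit(f"spec still contains credential keys: {leaked}")
    print(json.dumps({"tuningConfig": config}, indent=2))
```

Running the same check against the original s3n template would flag all four awsAccessKeyId / awsSecretAccessKey entries.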

Hey,
You can use the property druid.startup.logging.maskProperties (see https://druid.apache.org/docs/latest/configuration/index.html#startup-logging).

It is set to ["password"] by default; you can change it to, say, ["password", "secretKey", "awsSecretAccessKey"] in the common.runtime.properties file.

Once you do that, you’ll see something like this in the log (instead of the actual sensitive value):

hadoop.fs.s3n.awsSecretAccessKey: <masked>
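To make the effect of that setting concrete, here is a minimal Python sketch of the masking idea. It assumes a property is masked when its name contains one of the configured words as a case-sensitive substring, which is how I read the docs; it is not Druid's actual code, and mask_properties is a hypothetical helper.

```python
MASKED = "<masked>"


def mask_properties(props, mask_words):
    """Return a copy of props with the value hidden for every property
    whose name contains one of mask_words (case-sensitive substring)."""
    return {
        name: MASKED if any(word in name for word in mask_words) else value
        for name, value in props.items()
    }


if __name__ == "__main__":
    # Placeholder values only; never put real keys in an example.
    props = {
        "hadoop.fs.s3n.awsAccessKeyId": "ACCESS_KEY_ID",
        "hadoop.fs.s3n.awsSecretAccessKey": "SECRET_ACCESS_KEY",
        "druid.metadata.storage.connector.password": "PASSWORD",
    }
    mask_words = ["password", "secretKey", "awsSecretAccessKey"]
    for name, value in sorted(mask_properties(props, mask_words).items()):
        print(f"{name}: {value}")
```

Under that assumption the access key ID would still be printed, because none of the three words occurs in awsAccessKeyId; adding "awsAccessKeyId" to the list would cover it as well.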

Hope that helps :)

Itai

Hi,

I have tried this approach, but it is still displaying the access and secret keys. Do we still have to specify the secret and access keys as shown below? If I remove them, I get the error 'secret/access key not provided'.

"fs.s3a.awsAccessKeyId": "accesskey",
"fs.s3a.awsSecretAccessKey": "secretkey",
"fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
"fs.s3a.server-side-encryption-algorithm": "AES256",
"fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",

Thanks

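These are Hadoop properties, so can you try keeping them in core-site.xml instead of the spec?

<property>
  <name>fs.s3a.access.key</name>
  <value>xxx</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>xxxxxxx</value>
</property>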
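If it helps, the two properties above can be rendered into a core-site.xml fragment from environment variables, so the keys never appear in the ingestion spec. This is only a sketch: build_core_site is a hypothetical helper, and reading AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the environment is an assumption about your setup.

```python
import os
import xml.etree.ElementTree as ET


def build_core_site(access_key, secret_key):
    """Return a <configuration> document holding the two s3a credential
    properties, serialized as a string."""
    root = ET.Element("configuration")
    entries = (
        ("fs.s3a.access.key", access_key),
        ("fs.s3a.secret.key", secret_key),
    )
    for name, value in entries:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Fall back to the placeholders from the post when the variables are unset.
    xml_text = build_core_site(
        os.environ.get("AWS_ACCESS_KEY_ID", "xxx"),
        os.environ.get("AWS_SECRET_ACCESS_KEY", "xxxxxxx"),
    )
    print(xml_text)
```

The generated file still contains the secret, so it should be readable only by the user that runs the Hadoop and Druid processes.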