How to configure Druid to use a non-AWS S3 service as deep storage?

I am using Druid 0.9 and I tried to configure it to use a non-AWS S3 service as deep storage.

How do I specify the S3 endpoint in the druid config?

The S3 extension for deep storage uses jets3t under the hood. You should be able to have a jets3t.properties on the class path, but I haven’t really experimented with it.

Su -

Here are the properties you need to set in conf/_common/common.runtime.properties -

druid.storage.type=s3

druid.storage.bucket=bucketname

druid.storage.baseKey=druid/segments

druid.s3.accessKey=S3_ACCESS_KEY

druid.s3.secretKey=S3_SECRET_KEY

Replace the bucket name, access key, and secret key according to your S3 settings. Disable any other storage type (local or hdfs) that you may have set previously.

In case you are behind a proxy, you need to enable the proxy and set up the proxy endpoint for S3 using a jets3t.properties file. Create this file and copy it into the _common directory.

You will need the following properties in there -

httpclient.proxy-autodetect=false

httpclient.proxy-host=proxy-url

httpclient.proxy-port=proxy-port

Adjust your proxy-url and port accordingly.

[I used this in my POC, and it works]

Hope that helps.

Hi Jagadeesh,

Your message is very helpful. I am behind a proxy. How do I configure the endpoint to access S3 in jets3t.properties?
Does something like this work?
s3service.s3-endpoint=http://mys3-api-endpoint.opst.mtlabs.com:5000/v2.0

Difu

What you specify in the jets3t.properties file are your proxy hostname and port. All S3 settings would go into common.runtime.properties. Please look at my post above for a sample.

I see in http://www.jets3t.org/toolkit/configuration.html that the RestS3Service-related config needs to be set in jets3t.properties, and this includes s3service.s3-endpoint. Do you set this config? Do you set it in common.runtime.properties? Should I set the whole URL or just the host name?

Difu

I added a jets3t.properties file with the following config and it worked for me:

s3service.s3-endpoint=<non_aws_service_host>

s3service.s3-endpoint-http-port=<non_aws_service_port>

s3service.disable-dns-buckets=true

s3service.https-only=false

Saravana
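
For the endpoint asked about earlier in the thread, the settings above suggest splitting the URL into its parts rather than passing it whole. This is only a sketch, assuming the service is reached over plain HTTP. The s3service.s3-endpoint-virtual-path line is an assumption taken from the JetS3t configuration guide, not something tested in this thread, and whether /v2.0 belongs there depends on the service.

```properties
# jets3t.properties (untested sketch built from the example URL in the question)
# Host name only: no scheme, port, or path
s3service.s3-endpoint=mys3-api-endpoint.opst.mtlabs.com
# Port used for plain-HTTP access
s3service.s3-endpoint-http-port=5000
# Assumption: path prefix of the service, per the JetS3t configuration guide
s3service.s3-endpoint-virtual-path=/v2.0
# Use path-style bucket addressing instead of bucket-name DNS subdomains
s3service.disable-dns-buckets=true
# Allow HTTP instead of forcing HTTPS
s3service.https-only=false
```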

And of course common.runtime.properties file had the following:

druid.storage.type=s3

druid.s3.accessKey=<access_key>

druid.s3.secretKey=<secret_key>

druid.storage.bucket=<bucket_name>

druid.storage.baseKey=<sub_directory_within_the_bucket>
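
For anyone who is also behind a proxy, the proxy settings from the earlier post and the endpoint settings presumably go into the same jets3t.properties file, placed next to common.runtime.properties in the _common directory so that it ends up on the classpath. This is a sketch with placeholder values, not a tested setup; <proxy_host> and <proxy_port> are stand-ins for your own values.

```properties
# jets3t.properties (sketch combining the two posts in this thread)
# Non-AWS S3 endpoint
s3service.s3-endpoint=<non_aws_service_host>
s3service.s3-endpoint-http-port=<non_aws_service_port>
s3service.disable-dns-buckets=true
s3service.https-only=false
# Proxy settings, only needed if the service is reached through a proxy
httpclient.proxy-autodetect=false
httpclient.proxy-host=<proxy_host>
httpclient.proxy-port=<proxy_port>
```

The druid.storage.* and druid.s3.* properties stay in common.runtime.properties as shown above; only the endpoint and proxy settings belong in jets3t.properties.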