S3 Signature Version 4 support in newer AWS regions

We’ve run into an issue with the indexing service in newer AWS regions such as ap-northeast-1 and eu-central-1. The index.zip files are uploaded to S3 successfully, but historical nodes are not able to read them and fail with the error below. I believe this might be related to Signature Version 4 support.

Caused by: org.jets3t.service.impl.rest.HttpException: 400 Bad Request
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:425) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:279) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestHead(RestStorageService.java:1052) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2264) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectDetailsImpl(RestStorageService.java:2193) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:2574) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1773) ~[jets3t-0.9.4.jar:0.9.4]

I’ve seen previous posts on similar issues. Is there a fix or workaround for this?
https://groups.google.com/forum/#!searchin/druid-user/Ingetst$20local$20data$20to$20s3$20deep$20storage$20failed|sort:relevance/druid-user/E-Hd0nsY2Wg/m9Z1VwEsBQAJ

https://groups.google.com/forum/#!searchin/druid-user/s3$20signature$20version$204|sort:relevance/druid-user/vpAOj9KIoTg/etHponv4BAAJ

https://groups.google.com/forum/#!searchin/druid-user/s3$20signature$20version$204|sort:relevance/druid-user/VYAySNm7PUw/JHTlSOmFAQAJ

We are using Druid 0.8.3.

I was able to solve the problem. After raising the log level to debug, I found the following:
2016-08-22 05:08:27,506 DEBUG o.j.s.Jets3tProperties [ZkCoordinator-0] s3service.s3-endpoint=s3.amazonaws.com
2016-08-22 05:08:27,506 DEBUG o.j.s.Jets3tProperties [ZkCoordinator-0] storage-service.request-signature-version=AWS2

Obviously the S3 endpoint is not correct for the region (ap-northeast-1 in my case), and AWS2 is not the correct signature version either.

Two things are required to fix this:

  1. Add a jets3t.properties file at _common/jets3t.properties.
  2. In jets3t.properties, add the lines below:
    s3service.s3-endpoint=s3.ap-northeast-2.amazonaws.com
    storage-service.request-signature-version=AWS4-HMAC-SHA256

Then restart the historical nodes. Segment loading should work after that.
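
For anyone scripting this, here is a minimal Python sketch of the two steps above. The function names and the directory argument are placeholders of mine, not something Druid or jets3t provides. The endpoint pattern s3.<region>.amazonaws.com and the two property keys are taken from the lines above.

```python
from pathlib import Path


def render_jets3t_properties(region: str) -> str:
    """Return jets3t.properties content for the given region.

    Hypothetical helper: it only reproduces the two settings from the
    post, with the endpoint built as s3.<region>.amazonaws.com.
    """
    lines = [
        f"s3service.s3-endpoint=s3.{region}.amazonaws.com",
        "storage-service.request-signature-version=AWS4-HMAC-SHA256",
    ]
    return "\n".join(lines) + "\n"


def write_jets3t_properties(common_dir: str, region: str) -> Path:
    """Write jets3t.properties into common_dir and return its path."""
    target = Path(common_dir) / "jets3t.properties"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_jets3t_properties(region))
    return target
```

Calling write_jets3t_properties("conf/_common", "eu-central-1") would write the file for eu-central-1, assuming conf/_common is where your _common directory lives. The nodes still need a restart afterwards.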

Some references that helped me get to this point:
Gian’s comment on turning on the debug log: https://groups.google.com/forum/#!topic/druid-user/efSrQt8a3S8
jets3t support for S3 sigv4: https://bitbucket.org/jmurty/jets3t/issues/183/support-for-aws-signature-version-4

Hope this helps people using newer AWS regions like eu-central-1 and ap-northeast-1.
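
If you want to check what your own nodes are using, the Jets3tProperties debug lines can be pulled out of a log with something like the sketch below. It assumes the log layout matches the two lines quoted above (timestamp, DEBUG, o.j.s.Jets3tProperties, thread name in brackets, then key=value). A different logging pattern would need a different regex, and the function name is my own.

```python
import re

# Matches lines shaped like the ones in the post, e.g.
# "... DEBUG o.j.s.Jets3tProperties [ZkCoordinator-0] key=value"
_PROP_LINE = re.compile(
    r"DEBUG\s+o\.j\.s\.Jets3tProperties\s+\[[^\]]*\]\s+([^=\s]+)=(.*)$"
)


def effective_jets3t_settings(log_text: str) -> dict:
    """Collect key=value pairs logged by Jets3tProperties at DEBUG level.

    If a key is logged more than once, the last value wins.
    """
    settings = {}
    for line in log_text.splitlines():
        match = _PROP_LINE.search(line)
        if match:
            settings[match.group(1)] = match.group(2).strip()
    return settings
```

On a node that has not picked up the fix, this would show s3.amazonaws.com and AWS2 for the two keys, as in the log excerpt above.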

Awesome, thanks for researching this solution!

Some additional notes:

The configs above fix the historical nodes not being able to read from S3 in those AWS regions. However, once those configs are set, batch indexing starts to fail with java.io.IOException: Resetting to invalid mark. To fix the entire issue, below is what I have in my _common/jets3t.properties:

s3service.s3-endpoint=s3.eu-central-1.amazonaws.com
storage-service.request-signature-version=AWS4-HMAC-SHA256
uploads.stream-retry-buffer-size=2147483646

uploads.stream-retry-buffer-size has to be bigger than the final segment size before uploading to S3. I’m not entirely sure of the reason, but that’s what I observed. This is the failure without it:

Caused by: java.lang.RuntimeException: Failed to automatically set required header "x-amz-content-sha256" for request with entity org.jets3t.service.impl.rest.httpclient.RepeatableRequestEntity@36e1eb58
at org.jets3t.service.utils.SignatureUtils.awsV4GetOrCalculatePayloadHash(SignatureUtils.java:259) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.authorizeHttpRequest(RestStorageService.java:778) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:326) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:279) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestPut(RestStorageService.java:1157) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.createObjectImpl(RestStorageService.java:1968) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.putObjectWithRequestEntityImpl(RestStorageService.java:1889) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.putObjectImpl(RestStorageService.java:1881) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.StorageService.putObject(StorageService.java:840) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.S3Service.putObject(S3Service.java:2212) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.S3Service.putObject(S3Service.java:2356) ~[jets3t-0.9.4.jar:0.9.4]
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.storeFile(Jets3tNativeFileSystemStore.java:87) ~[hadoop-common-2.3.0.jar:?]
… 33 more
Caused by: java.io.IOException: Resetting to invalid mark
at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) ~[?:1.8.0_102]
at org.jets3t.service.utils.ServiceUtils.hash(ServiceUtils.java:238) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.utils.ServiceUtils.hashSHA256(ServiceUtils.java:267) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.utils.SignatureUtils.awsV4GetOrCalculatePayloadHash(SignatureUtils.java:251) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.authorizeHttpRequest(RestStorageService.java:778) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:326) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:279) ~[jets3t-0.9.4.jar:0.9.4]
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestPut(RestStorageService.java:1157) ~[jets3t-0.9.4.jar:0.9.4]
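
Reading the trace above, the IOException is thrown by BufferedInputStream.reset, called from ServiceUtils.hash while jets3t computes the x-amz-content-sha256 header. My guess, not verified against the jets3t source, is that the payload is read once for hashing and then rewound, and the rewind fails when the payload is larger than the retry buffer. The sketch below checks a jets3t.properties text against an expected segment size. The function names are mine, and a missing key is treated as 0 rather than modelling jets3t's own default.

```python
JAVA_INT_MAX = 2**31 - 1  # 2147483647; the value in the post is one less


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def buffer_covers_segment(properties_text: str, segment_bytes: int) -> bool:
    """True if uploads.stream-retry-buffer-size exceeds segment_bytes.

    Assumption: an absent key counts as 0 here; jets3t's real default
    is not modelled.
    """
    props = parse_properties(properties_text)
    buffer_bytes = int(props.get("uploads.stream-retry-buffer-size", "0"))
    return buffer_bytes > segment_bytes
```

Note that 2147483646 is just under Java's maximum int, so under this reading the workaround covers segments up to roughly 2 GiB and no more.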

Can you please clarify how to provide jets3t.properties config for batch index tasks?

Awesome man!!!

You deserve a beer

Where is the _common folder for _common/jets3t.properties?

It’s in conf/_common (or conf-quickstart/_common if you’re using the quickstart config).

Thank you man!!!

Hi,

Unfortunately this solution does not work with version 0.11.0. Is there something else that has to be done?

Thanks,

Chaitanya

I did the following to use S3 deep storage in eu-central-1:

Build Druid 0.11.0 with modifications to use Hadoop 2.8.3
(for example: https://github.com/druid-io/druid/compare/0.11.0...hoesler:feature/hadoop2.8)

    git clone https://github.com/hoesler/druid.git
    cd druid
    git checkout 47290406a5fa01200545ab0825e7500dafdcfaba
    mvn clean package -DskipTests

This creates the following files:

  • distribution/target/druid-0.11.0-bin.tar.gz
  • distribution/target/mysql-metadata-storage-0.11.0.tar.gz

Use the druid-hdfs-storage extension with an s3a storage directory. This should work the same way as s3 deep storage. Example relevant part of _common/common.runtime.properties:

    druid.extensions.loadList=["druid-s3-extensions", "mysql-metadata-storage", "druid-hdfs-storage"]
    #druid.storage.type=s3
    #druid.storage.bucket=${S3_BUCKET}
    #druid.storage.baseKey=druid/segments
    druid.s3.accessKey=${S3_ACCESS_KEY_ID}
    druid.s3.secretKey=${S3_SECRET_ACCESS_KEY}
    druid.storage.type=hdfs
    druid.storage.storageDirectory=s3a://${S3_BUCKET}/druid/segments
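
The ${...} placeholders above have to be filled in somehow before the nodes read the file. One way, assuming you render the file yourself at deploy time, is Python's string.Template, which uses the same ${NAME} syntax. The template constant and function name here are illustrative only, and the commented-out lines from the example are left out.

```python
from string import Template

# Active (non-commented) lines from the example above.
COMMON_RUNTIME_TEMPLATE = Template(
    'druid.extensions.loadList=["druid-s3-extensions", '
    '"mysql-metadata-storage", "druid-hdfs-storage"]\n'
    "druid.s3.accessKey=${S3_ACCESS_KEY_ID}\n"
    "druid.s3.secretKey=${S3_SECRET_ACCESS_KEY}\n"
    "druid.storage.type=hdfs\n"
    "druid.storage.storageDirectory=s3a://${S3_BUCKET}/druid/segments\n"
)


def render_common_runtime(values) -> str:
    """Fill in the placeholders; raises KeyError if one is missing."""
    return COMMON_RUNTIME_TEMPLATE.substitute(values)
```

Passing os.environ as values would take the bucket and keys from the environment. substitute (rather than safe_substitute) is used so that a missing variable fails loudly instead of leaving a literal ${...} in the config.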

Have hadoop use S3a. Example relevant part of _common/core-site.xml:

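<property>
  <name>fs.s3a.endpoint</name>
  <value>s3.${AWS_REGION}.amazonaws.com</value>
</property>
<property>
  <name>fs.s3.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
  <name>fs.s3n.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>${S3_ACCESS_KEY_ID}</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>${S3_SECRET_ACCESS_KEY}</value>
</property>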
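
The name/value pairs above go into Hadoop's usual <configuration><property><name>…</name><value>…</value></property></configuration> layout. As a rough sketch, assuming you generate core-site.xml from a script, the same six properties can be produced with Python's standard xml.etree.ElementTree. The function name and arguments are my own choices.

```python
import xml.etree.ElementTree as ET


def build_core_site(region: str, access_key: str, secret_key: str) -> str:
    """Return core-site.xml text with the six properties from the post."""
    s3a = "org.apache.hadoop.fs.s3a.S3AFileSystem"
    properties = {
        "fs.s3a.endpoint": f"s3.{region}.amazonaws.com",
        "fs.s3.impl": s3a,
        "fs.s3n.impl": s3a,
        "fs.s3a.impl": s3a,
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
    }
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent only exists on Python 3.9+; older versions emit one line.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")
```

Mapping fs.s3.impl and fs.s3n.impl to S3AFileSystem as well, as the post does, means existing s3:// and s3n:// paths are handled by the S3A client too.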

Thanks to https://github.com/hoesler