Historical nodes failed to download segments from S3

Hello,

Generating segments with EMR worked fine; I can see the newly created segments in the S3 bucket.

However, when a historical node tried to download the segment from S3, it failed.

Since I am not experienced with AWS, I am having a hard time finding the problem.

Druid: 0.9.1.1

OS: Amazon Linux AMI release 2016.03

_common/common.runtime.properties

druid.extensions.loadList=["druid-s3-extensions"]

druid.storage.type=s3

druid.storage.bucket=druid-bench

druid.storage.baseKey=druid/segments

druid.s3.accessKey=…

druid.s3.secretKey=…

Historical node log:

io.druid.segment.loading.SegmentLoadingException: Exception loading segment[wikiticker_2015-09-12T00:00:00.000Z_2015-09-13T00:00:00.000Z_2016-08-19T09:49:18.904Z]

at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:309) ~[druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:350) [druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:44) [druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.server.coordination.ZkCoordinator$1.childEvent(ZkCoordinator.java:152) [druid-server-0.9.1.1.jar:0.9.1.1]

at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:522) [curator-recipes-2.10.0.jar:?]

at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:516) [curator-recipes-2.10.0.jar:?]

at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93) [curator-framework-2.10.0.jar:?]

at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [guava-16.0.1.jar:?]

at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:85) [curator-framework-2.10.0.jar:?]

at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:514) [curator-recipes-2.10.0.jar:?]

at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35) [curator-recipes-2.10.0.jar:?]

at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:772) [curator-recipes-2.10.0.jar:?]

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [?:1.7.0_111]

at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_111]

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [?:1.7.0_111]

at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_111]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_111]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_111]

at java.lang.Thread.run(Thread.java:745) [?:1.7.0_111]

Caused by: java.lang.RuntimeException: org.jets3t.service.ServiceException: Request Error. -- ResponseCode: 400, ResponseStatus: Bad Request, RequestId: FD20B18427AE04C8, HostId: yE5BEAYtDmc8Vr2Sg0QN6PHDGImbxZYmAnbKppdBvYJXELBy5xBXtUssUuFo0fs+pBD/WYnK9vU=

at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]

at io.druid.storage.s3.S3DataSegmentPuller.isObjectInBucket(S3DataSegmentPuller.java:341) ~[?:?]

at io.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:174) ~[?:?]

at io.druid.storage.s3.S3LoadSpec.loadSegment(S3LoadSpec.java:62) ~[?:?]

at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:143) ~[druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:95) ~[druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:152) ~[druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:305) ~[druid-server-0.9.1.1.jar:0.9.1.1]

… 18 more

Caused by: org.jets3t.service.ServiceException: Request Error.

at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:426) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:279) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestHead(RestStorageService.java:1052) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2264) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectDetailsImpl(RestStorageService.java:2193) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:1120) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:575) ~[jets3t-0.9.4.jar:0.9.4]

at io.druid.storage.s3.S3Utils.isObjectInBucket(S3Utils.java:92) ~[?:?]

at io.druid.storage.s3.S3DataSegmentPuller$4.call(S3DataSegmentPuller.java:332) ~[?:?]

at io.druid.storage.s3.S3DataSegmentPuller$4.call(S3DataSegmentPuller.java:328) ~[?:?]

at com.metamx.common.RetryUtils.retry(RetryUtils.java:60) ~[java-util-0.27.9.jar:?]

at com.metamx.common.RetryUtils.retry(RetryUtils.java:78) ~[java-util-0.27.9.jar:?]

at io.druid.storage.s3.S3Utils.retryS3Operation(S3Utils.java:85) ~[?:?]

at io.druid.storage.s3.S3DataSegmentPuller.isObjectInBucket(S3DataSegmentPuller.java:326) ~[?:?]

at io.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:174) ~[?:?]

at io.druid.storage.s3.S3LoadSpec.loadSegment(S3LoadSpec.java:62) ~[?:?]

at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:143) ~[druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:95) ~[druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:152) ~[druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:305) ~[druid-server-0.9.1.1.jar:0.9.1.1]

… 18 more

Caused by: org.jets3t.service.impl.rest.HttpException: 400 Bad Request

at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:425) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:279) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestHead(RestStorageService.java:1052) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2264) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectDetailsImpl(RestStorageService.java:2193) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:1120) ~[jets3t-0.9.4.jar:0.9.4]

at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:575) ~[jets3t-0.9.4.jar:0.9.4]

at io.druid.storage.s3.S3Utils.isObjectInBucket(S3Utils.java:92) ~[?:?]

at io.druid.storage.s3.S3DataSegmentPuller$4.call(S3DataSegmentPuller.java:332) ~[?:?]

at io.druid.storage.s3.S3DataSegmentPuller$4.call(S3DataSegmentPuller.java:328) ~[?:?]

at com.metamx.common.RetryUtils.retry(RetryUtils.java:60) ~[java-util-0.27.9.jar:?]

at com.metamx.common.RetryUtils.retry(RetryUtils.java:78) ~[java-util-0.27.9.jar:?]

at io.druid.storage.s3.S3Utils.retryS3Operation(S3Utils.java:85) ~[?:?]

at io.druid.storage.s3.S3DataSegmentPuller.isObjectInBucket(S3DataSegmentPuller.java:326) ~[?:?]

at io.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:174) ~[?:?]

at io.druid.storage.s3.S3LoadSpec.loadSegment(S3LoadSpec.java:62) ~[?:?]

at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:143) ~[druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:95) ~[druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:152) ~[druid-server-0.9.1.1.jar:0.9.1.1]

at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:305) ~[druid-server-0.9.1.1.jar:0.9.1.1]

… 18 more

I guess the problem is due to the default endpoint, which is s3.amazonaws.com.

Or is there some other problem causing this?

$ aws s3 ls s3://druid-bench --region ap-northeast-2

PRE druid/

PRE logs/

$ aws s3 ls s3://druid-bench --region ap-northeast-1

An error occurred (InvalidRequest) when calling the ListObjects operation: You are attempting to operate on a bucket in a region that requires Signature Version 4. You can fix this issue by explicitly providing the correct region location using the --region argument, the AWS_DEFAULT_REGION environment variable, or the region variable in the AWS CLI configuration file. You can get the bucket's location by running "aws s3api get-bucket-location --bucket BUCKET".
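As the error message suggests, the bucket's actual region can be confirmed directly with s3api. For my bucket in ap-northeast-2, the output should look something like this:

$ aws s3api get-bucket-location --bucket druid-bench
{
    "LocationConstraint": "ap-northeast-2"
}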

+1, we’ve seen the same issue in eu-central-1 and ap-northeast-1.

I believe some of these failures are just transient communication problems with S3 that occur from time to time.

The V4 signature problem is real though.

The underlying problems are https://issues.apache.org/jira/browse/HADOOP-9248 and https://issues.apache.org/jira/browse/HADOOP-13325

The confirmed workaround is:

1. Clone Druid master, add case "s3a": at line 404 of JobHelper.java, change aws-java-sdk to version 1.7.4 in pom.xml, and rebuild.

2. In common.runtime.properties, configure S3 deep storage as normal.

3. Save a file at conf/druid/_common/jets3t.properties with the contents:

s3service.s3-endpoint=s3.ap-northeast-2.amazonaws.com
storage-service.request-signature-version=AWS4-HMAC-SHA256

4. Run: java -cp "dist/druid/lib/*" -Ddruid.extensions.directory="dist/druid/extensions" -Ddruid.extensions.hadoopDependenciesDir="dist/druid/hadoop-dependencies" io.druid.cli.Main tools pull-deps --no-default-hadoop -h "org.apache.hadoop:hadoop-client:2.7.2" -h "org.apache.hadoop:hadoop-aws:2.7.2"

5. In druid.indexer.runner.javaOpts on the middleManager, add -Dcom.amazonaws.services.s3.enableV4 (see the runtime.properties sketch after this list).

6. In the job JSON, set "hadoopDependencyCoordinates" : ["org.apache.hadoop:hadoop-client:2.7.2", "org.apache.hadoop:hadoop-aws:2.7.2"]

7. In the job JSON, set the following (an assembled task fragment follows below):

"jobProperties" : {
  "fs.s3.impl" : "org.apache.hadoop.fs.s3a.S3AFileSystem",
  "fs.s3n.impl" : "org.apache.hadoop.fs.s3a.S3AFileSystem",
  "fs.s3a.endpoint" : "s3.ap-northeast-2.amazonaws.com",
  "fs.s3a.access.key" : "XXX",
  "fs.s3a.secret.key" : "YYY"
}
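For step 5, the middleManager runtime.properties line ends up looking roughly like this. Everything other than the -Dcom.amazonaws.services.s3.enableV4 flag is an illustrative placeholder, not a value taken from this thread:

# middleManager runtime.properties (sketch): only the enableV4 flag is the actual change;
# the heap size, timezone, and encoding flags are placeholders from a typical setup.
druid.indexer.runner.javaOpts=-server -Xmx2g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Dcom.amazonaws.services.s3.enableV4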
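Putting steps 6 and 7 together, the Hadoop index task is shaped roughly as below. This is only a sketch, not a complete spec: hadoopDependencyCoordinates sits at the top level of the task, jobProperties goes under spec.tuningConfig, and the dataSchema and ioConfig sections (omitted here) stay whatever your job already uses:

{
  "type" : "index_hadoop",
  "hadoopDependencyCoordinates" : ["org.apache.hadoop:hadoop-client:2.7.2", "org.apache.hadoop:hadoop-aws:2.7.2"],
  "spec" : {
    "tuningConfig" : {
      "type" : "hadoop",
      "jobProperties" : {
        "fs.s3.impl" : "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "fs.s3n.impl" : "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "fs.s3a.endpoint" : "s3.ap-northeast-2.amazonaws.com",
        "fs.s3a.access.key" : "XXX",
        "fs.s3a.secret.key" : "YYY"
      }
    }
  }
}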

I’m facing the same issue. Has it been fixed, or is it still the same? I’m using druid-0.9.2.

We had the same issue. Our workaround was to create a new bucket in eu-west-1, which still supports Signature Version 2 (see the CLI sketch below).
http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
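If you go that route, creating the bucket and copying existing segments over is straightforward with the CLI. The target bucket name here is a placeholder, and the source prefix assumes the druid/segments baseKey from earlier in this thread:

# create a bucket in a region that still accepts Signature Version 2 (placeholder name)
aws s3 mb s3://druid-bench-eu --region eu-west-1

# copy the existing segments across, keeping the same key prefix
aws s3 sync s3://druid-bench/druid/segments s3://druid-bench-eu/druid/segments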

See my comment here for using s3a deep storage:
https://groups.google.com/d/msg/druid-user/i3qK0u5BDGM/iyjShu8EAQAJ