Error when using S3 deep storage with Backblaze B2 (S3-compatible API)

Hello!

I’ve been trying for a few hours to deploy Druid on Kubernetes using Backblaze B2’s S3-compatible API as deep storage, but I keep hitting errors. At first I thought it was related to not using ZooKeeper (thanks Himanshu Gupta for making it possible to skip yet another ZK on the cluster), but I switched to ZK temporarily and the behavior continued.

After reading the docs, I set druid.storage.disableAcl=true (assuming, if I’m understanding the phrasing correctly, that this disables the ACLs). It is also configured this way in this 2018 post: Deep storage on Oracle Cloud (S3 compat. API) - #2 by Gian_Merlino

So, when I use this config:

# s3 config
druid.storage.disableAcl=true
druid.s3.enablePathStyleAccess=true
druid.s3.endpoint.url=s3.us-west-004.backblazeb2.com
druid.s3.protocol=https
druid.s3.endpoint.signingRegion=us-west-004
druid.indexer.logs.type=s3
druid.storage.bucket=bucket-name
druid.indexer.logs.s3Bucket=bucket-name
druid.indexer.logs.s3Prefix=druid/indexing-logs
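
(Not shown above: the rest of the S3 setup also lives in the common runtime properties. Assuming the usual property names from the druid-s3-extensions docs, it looks roughly like this; the credentials themselves come from environment variables, shown further down.)

# assumed remainder of the S3 setup (extension, storage type, segment prefix)
druid.extensions.loadList=["druid-s3-extensions"]
druid.storage.type=s3
druid.storage.baseKey=druid/segments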

And submit this simple job, the segment file is correctly stored in the bucket, but it never becomes available/queryable because the historical cannot download it (the coordinator keeps retrying forever, though, and the data is not lost because Druid is smart enough to keep it on the indexer while the historical is not ready):

  "type": "index_parallel",
  "spec": {
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "inline",
        "data": "foo,bar\nfoo,bar\nfoo,bar"
      },
      "inputFormat": {
        "type": "csv",
        "findColumnsFromHeader": true
      }
    },
    "tuningConfig": {
      "type": "index_parallel",
      "partitionsSpec": {
        "type": "dynamic"
      }
    },
    "dataSchema": {
      "dataSource": "inline_data",
      "timestampSpec": {
        "column": "!!!_no_such_column_!!!",
        "missingValue": "2010-01-01T00:00:00Z"
      },
      "dimensionsSpec": {
        "dimensions": [
          "bar",
          "foo"
        ]
      },
      "granularitySpec": {
        "queryGranularity": "none",
        "rollup": false,
        "segmentGranularity": "hour"
      }
    }
  }
}

The historical fails with a null pointer exception.
I changed the cluster logging to TRACE but still nothing useful showed up in the log, though I can see some requests before the download (I don’t think it’s the .zip segment yet) that returned 200:

[druid-druid-cluster-historicals-0] 2022-03-29T21:34:09,929 TRACE [SimpleDataSegmentChangeHandler-0] com.amazonaws.request - Done parsing service response XML
[druid-druid-cluster-historicals-0] 2022-03-29T21:34:09,929 DEBUG [SimpleDataSegmentChangeHandler-0] com.amazonaws.request - Received successful response: 200, AWS Request ID: cf9e800baf6a3ebc
[druid-druid-cluster-historicals-0] 2022-03-29T21:34:09,929 DEBUG [SimpleDataSegmentChangeHandler-0] com.amazonaws.requestId - x-amzn-RequestId: not available
[druid-druid-cluster-historicals-0] 2022-03-29T21:34:09,929 DEBUG [SimpleDataSegmentChangeHandler-0] com.amazonaws.requestId - AWS Request ID: cf9e800baf6a3ebc
[druid-druid-cluster-historicals-0] 2022-03-29T21:34:09,930 ERROR [SimpleDataSegmentChangeHandler-0] org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager - Failed to load segment in current location [/druid/data/segment-cache], try next location if any: {class=org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager, exceptionType=class org.apache.druid.segment.loading.SegmentLoadingException, exceptionMessage=null, location=/druid/data/segment-cache}
[druid-druid-cluster-historicals-0] org.apache.druid.segment.loading.SegmentLoadingException: null
[druid-druid-cluster-historicals-0] 	at org.apache.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:135) ~[?:?]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.storage.s3.S3LoadSpec.loadSegment(S3LoadSpec.java:61) ~[?:?]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.loadInLocation(SegmentLoaderLocalCacheManager.java:304) ~[druid-server-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.loadInLocationWithStartMarker(SegmentLoaderLocalCacheManager.java:292) ~[druid-server-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.loadSegmentWithRetry(SegmentLoaderLocalCacheManager.java:253) ~[druid-server-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:225) ~[druid-server-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:186) ~[druid-server-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.server.SegmentManager.getAdapter(SegmentManager.java:278) ~[druid-server-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.server.SegmentManager.loadSegment(SegmentManager.java:224) ~[druid-server-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.server.coordination.SegmentLoadDropHandler.loadSegment(SegmentLoadDropHandler.java:272) ~[druid-server-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.server.coordination.SegmentLoadDropHandler.addSegment(SegmentLoadDropHandler.java:320) ~[druid-server-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.server.coordination.SegmentLoadDropHandler$1.lambda$addSegment$1(SegmentLoadDropHandler.java:529) ~[druid-server-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_275]
[druid-druid-cluster-historicals-0] 	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_275]
[druid-druid-cluster-historicals-0] 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_275]
[druid-druid-cluster-historicals-0] 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_275]
[druid-druid-cluster-historicals-0] 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_275]
[druid-druid-cluster-historicals-0] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_275]
[druid-druid-cluster-historicals-0] 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
[druid-druid-cluster-historicals-0] Caused by: java.lang.NullPointerException
[druid-druid-cluster-historicals-0] 	at org.apache.druid.storage.s3.S3Utils.getSingleObjectSummary(S3Utils.java:200) ~[?:?]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.storage.s3.S3DataSegmentPuller.buildFileObject(S3DataSegmentPuller.java:154) ~[?:?]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.storage.s3.S3DataSegmentPuller.access$000(S3DataSegmentPuller.java:57) ~[?:?]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.storage.s3.S3DataSegmentPuller$1.openStream(S3DataSegmentPuller.java:91) ~[?:?]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.utils.CompressionUtils.lambda$unzip$1(CompressionUtils.java:188) ~[druid-core-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:87) ~[druid-core-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:115) ~[druid-core-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:105) ~[druid-core-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.utils.CompressionUtils.unzip(CompressionUtils.java:187) ~[druid-core-0.21.1.jar:0.21.1]
[druid-druid-cluster-historicals-0] 	at org.apache.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:104) ~[?:?]
[druid-druid-cluster-historicals-0] 	... 18 more

(full logs further down, after more explanation)
If I use the console or other methods I can download the file just fine, and if I enter the pod/container I am able to touch files in the /druid/data/segment-cache/inline_data/ dir (as a matter of fact, after the job retries, this dir is removed).

When I tried this config using real S3 (which unfortunately I cannot use for this setup), I got an exception saying that the bucket did not support ACLs and therefore the file could not be downloaded (I’m sorry, I already deleted that bucket and access key/secret, so I no longer have the logs). But when I changed druid.storage.disableAcl=true to druid.storage.disableAcl=false, the task completed successfully on real AWS S3.
The real S3 bucket had ACLs disabled, as recommended by AWS on the new-bucket page.

I was happy that this worked (I thought I would just flip this flag and use the B2 S3 successfully), but I was a little confused by the phrasing (disableAcl=false, when I thought it should be disableAcl=true to disable this feature).
When I tried again with the Backblaze S3, to my surprise, the task now fails during the publishing phase, with this error:

[druid-druid-cluster-coordinators-0] 2022-03-29T21:25:11,684 INFO [forking-task-runner-0] org.apache.druid.storage.s3.S3Utils - Pushing [/druid/data/persistent/task/index_parallel_inline_data_boidgomo_2022-03-29T21:24:24.584Z/log] to bucket[bucket-name] and key[druid/indexing-logs/index_parallel_inline_data_boidgomo_2022-03-29T21:24:24.584Z/log].
[druid-druid-cluster-coordinators-0] 2022-03-29T21:25:11,753 INFO [forking-task-runner-0] org.apache.druid.indexing.overlord.ForkingTaskRunner - Exception caught during execution
[druid-druid-cluster-coordinators-0] java.lang.RuntimeException: com.amazonaws.services.s3.model.AmazonS3Exception: Backblaze does not support the 'x-amz-grant-full-control' header for this API call. (Service: Amazon S3; Status Code: 400; Error Code: InvalidArgument; Request ID: null; S3 Extended Request ID: null), S3 Extended Request ID: null
[druid-druid-cluster-coordinators-0] 	at org.apache.druid.storage.s3.S3TaskLogs.pushTaskFile(S3TaskLogs.java:156) ~[?:?]
[druid-druid-cluster-coordinators-0] 	at org.apache.druid.storage.s3.S3TaskLogs.pushTaskLog(S3TaskLogs.java:133) ~[?:?]
[druid-druid-cluster-coordinators-0] 	at org.apache.druid.indexing.overlord.ForkingTaskRunner$1.call(ForkingTaskRunner.java:386) [druid-indexing-service-0.21.1.jar:0.21.1]
[druid-druid-cluster-coordinators-0] 	at org.apache.druid.indexing.overlord.ForkingTaskRunner$1.call(ForkingTaskRunner.java:137) [druid-indexing-service-0.21.1.jar:0.21.1]
[druid-druid-cluster-coordinators-0] 	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_275]
[druid-druid-cluster-coordinators-0] 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_275]
[druid-druid-cluster-coordinators-0] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_275]
[druid-druid-cluster-coordinators-0] 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]

[druid-druid-cluster-coordinators-0] Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Backblaze does not support the 'x-amz-grant-full-control' header for this API call. (Service: Amazon S3; Status Code: 400; Error Code: InvalidArgument; Request ID: null; S3 Extended Request ID: null)
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1638) ~[aws-java-sdk-core-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1303) ~[aws-java-sdk-core-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1055) ~[aws-java-sdk-core-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) ~[aws-java-sdk-core-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) ~[aws-java-sdk-core-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) ~[aws-java-sdk-core-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) ~[aws-java-sdk-core-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) ~[aws-java-sdk-core-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) ~[aws-java-sdk-core-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4229) ~[aws-java-sdk-s3-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4176) ~[aws-java-sdk-s3-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1720) ~[aws-java-sdk-s3-1.11.199.jar:?]
[druid-druid-cluster-coordinators-0] 	at org.apache.druid.storage.s3.ServerSideEncryptingAmazonS3.putObject(ServerSideEncryptingAmazonS3.java:119) ~[?:?]
[druid-druid-cluster-coordinators-0] 	at org.apache.druid.storage.s3.S3Utils.uploadFileIfPossible(S3Utils.java:286) ~[?:?]
[druid-druid-cluster-coordinators-0] 	at org.apache.druid.storage.s3.S3TaskLogs.lambda$pushTaskFile$0(S3TaskLogs.java:149) ~[?:?]
[druid-druid-cluster-coordinators-0] 	at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:87) ~[druid-core-0.21.1.jar:0.21.1]
[druid-druid-cluster-coordinators-0] 	at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:115) ~[druid-core-0.21.1.jar:0.21.1]

[druid-druid-cluster-coordinators-0] 	at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:105) ~[druid-core-0.21.1.jar:0.21.1]
[druid-druid-cluster-coordinators-0] 	at org.apache.druid.storage.s3.S3Utils.retryS3Operation(S3Utils.java:91) ~[?:?]
[druid-druid-cluster-coordinators-0] 	at org.apache.druid.storage.s3.S3TaskLogs.pushTaskFile(S3TaskLogs.java:147) ~[?:?]

If I change back to druid.storage.disableAcl=false, I still get the same error as before. Here is the full historical log with TRACE enabled:

trace.hist.txt · GitHub (too long for this forum)

So, does anyone know how I can debug this issue further? According to the post below, Druid has been using the aws-java-sdk since 2018, so maybe there’s some field the SDK expects while parsing the List response that Backblaze B2 doesn’t return, and we can’t fix that on our side, except perhaps by not calling the list at all (why does it list the file instead of just downloading it? maybe there’s a “file exists” check that could be removed).

Some options I’m considering: testing another S3 provider, like DigitalOcean, or using MinIO (which I know many people use) as a proxy in front of B2, since I don’t see any advantage in running my own MinIO just to keep the data on cloud block storage.

Thank you for reading this far,
Renato

OK so this is a total stab in the dark (!) but I wonder if “precedence” might either explain it – or may even be something you could use to work around it…

I’m only using the third method, which is environment variables:

kind: "Druid"
metadata:
  name: druid-cluster
spec:
  image: apache/druid:0.21.1
  #... other configs, like commonConfigMountPath, jvm.options, common.runtime.properties 
  env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: AWS_REGION
      value: "us-west-004"
    - name: AWS_ACCESS_KEY_ID
      value: "000000000000000000000000"
    - name: AWS_SECRET_ACCESS_KEY
      value: "K0000000000000000000000000"

Here is the full spec I’m using: druid-k8s-s3.yaml · GitHub

I started from the tiny example I found online that uses S3. I expect that once this is resolved I’ll need to add MiddleManagers (MM) to make real-time ingestion work, but that’s simple (this PR has an example of that: druid-operator/do-druid.yaml at bfc62aa5b44b6fd5455d1c211100cbddd8c0ba2a · druid-io/druid-operator · GitHub ).

I think I will try again with TRACE using real S3 to compare the XML of the List call it does before the null pointer exception.

If I find some difference, maybe I can involve Backblaze, or try DigitalOcean Spaces as shown in the PR example, since that presumably works with disableAcl=false (the property is missing from that PR, and false is the default value, which would mean DO supports the s3:GetBucketAcl and s3:PutObjectAcl APIs, whereas B2 has only partial support for ACLs, as stated in www(.)backblaze(.)com/b2/docs/s3_compatible_api.html).

Hi @Renato_Cron. You might also ask on the development mailing list.

I had all the content below already written when I realized that maybe there’s another page with another disableAcl somewhere… and in fact, there is.

druid.storage.disableAcl=true
druid.indexer.logs.disableAcl=true

druid.indexer.logs.disableAcl was set to false (the default value).

I think that property should be mentioned on the S3-compatible · Apache Druid page as well, or removed so it follows the value of druid.storage.disableAcl, since it doesn’t make sense to have one enabled and the other disabled (AFAIK).

The sad part is that, with the above config, the import job completed successfully, but I’m back to being stuck at 0% availability (screenshot omitted), due to the initial org.apache.druid.segment.loading.SegmentLoadingException: null at org.apache.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:135) ~[?:?] (full log at https://gist.githubusercontent.com/renatocron/1f4ade2fbfb3d871af6433ff76334187/raw/18d62ce97c0056054f90fff505a62340aac2583d/gistfile1.txt)

So, for now, I’ll try again with real S3 and both ACL flags disabled. If that imports successfully, I will try DigitalOcean Spaces.

Proxying/gatewaying B2 via MinIO is not supported anymore, and trying to use it with the S3 API also did not work well locally.

At this point, there isn’t even a monetary incentive to keep trying to use B2; the savings aren’t worth it anymore LOL.

So, the content below is just for reference and/or for other people who find this page in the future:

Original content:

I tried again with real S3 to capture the logs: druid.storage.disableAcl=true is not really disabling all the ACLs. I created the bucket as follows (ACLs disabled; screenshot omitted).

The segment file is successfully created in the bucket (s3://test-druid-gl/druid/segments/inline_data/2010-01-01T00:00:00.000Z_2010-01-01T01:00:00.000Z/2022-03-30T21:26:27.490Z/0/index.zip), but the PUT of the indexing log from the coordinator returns 400:

[druid-druid-cluster-coordinators-0] 2022-03-30T21:26:54,994 DEBUG [forking-task-runner-0] org.apache.http.wire - http-outgoing-0 << "<Error><Code>AccessControlListNotSupported</Code><Message>The bucket does not allow ACLs</Message><RequestId>AJ17A3EMV4FSWDQ4</RequestId><HostId>/rVV7CbiEPrK1BdY8InAGVOS976YNyUXYIA0zP1fbKFtE+OAl7txYrHLGv1ZlnvBcch/cqkLW14=</HostId></Error>[\r][\n]"
druid-druid-cluster-coordinators-0 partial (very long, traces!) log at log.txt · GitHub

So maybe there are two code paths: one that respects the flag and another that doesn’t.

Strangely, I can’t find the log line for the segment upload. I even tried again with datasource-name="easyto_find_data_sc_name", but only found references to the indexing log (/test-druid-gl/druid/indexing-logs/index_parallel_easyto_find_data_sc_name_kbmhjfin_2022-03-30T22%3A04%3A08.714Z/log) and not the segment:

$ grep easyto_find_data_sc_name /tmp/all.logs | grep PUT

[druid-druid-cluster-coordinators-0] 2022-03-30T22:04:43,441 DEBUG [forking-task-runner-0] com.amazonaws.request - Sending Request: PUT https://s3.us-east-1.amazonaws.com /test-druid-gl/druid/indexing-logs/index_parallel_easyto_find_data_sc_name_kbmhjfin_2022-03-30T22%3A04%3A08.714Z/log Headers: (x-amz-grant-full-control: id="350398d3f04aa5f6dde4c0a88687e7ad8da9400a2c4833fa89249466bbe63f03", id="350398d3f04aa5f6dde4c0a88687e7ad8da9400a2c4833fa89249466bbe63f03", User-Agent: aws-sdk-java/1.11.199 Linux/4.19.0-17-amd64 OpenJDK_64-Bit_Server_VM/25.275-b01 java/1.8.0_275, amz-sdk-invocation-id: e66f3e08-62cd-8949-9bee-8d77f0ddb683, Content-Length: 484153, Content-MD5: k14ipq73QBoB9v43/yZwXg==, Content-Type: application/octet-stream, )  
[druid-druid-cluster-coordinators-0] 2022-03-30T22:04:43,442 DEBUG [forking-task-runner-0] org.apache.http.impl.execchain.MainClientExec - Executing request PUT /test-druid-gl/druid/indexing-logs/index_parallel_easyto_find_data_sc_name_kbmhjfin_2022-03-30T22%3A04%3A08.714Z/log HTTP/1.1 
[druid-druid-cluster-coordinators-0] 2022-03-30T22:04:43,442 DEBUG [forking-task-runner-0] org.apache.http.headers - http-outgoing-4 >> PUT /test-druid-gl/druid/indexing-logs/index_parallel_easyto_find_data_sc_name_kbmhjfin_2022-03-30T22%3A04%3A08.714Z/log HTTP/1.1 
[druid-druid-cluster-coordinators-0] 2022-03-30T22:04:43,443 DEBUG [forking-task-runner-0] org.apache.http.wire - http-outgoing-4 >> "PUT /test-druid-gl/druid/indexing-logs/index_parallel_easyto_find_data_sc_name_kbmhjfin_2022-03-30T22%3A04%3A08.714Z/log HTTP/1.1[\r][\n]" 

After the import, the console shows the segment as ‘100% available’ using S3 and disableAcl=true; the task is marked as failed with the errors above, but the data is queryable (it still lives somewhere in the cluster, I’m not sure on which pod).

After marking the segment as unused and sending a kill task, the segment file is removed from S3, but since something unexpected also happens there, in the end the job status is still failed.

Starting again with real S3 and the default value (druid.storage.disableAcl=false), same bucket, I got this error:

[druid-druid-cluster-coordinators-0] 2022-03-30T22:47:08,074 ERROR [forking-task-runner-0] org.apache.druid.indexing.overlord.TaskQueue - Failed to run task: {class=org.apache.druid.indexing.overlord.TaskQueue, exceptionType=class java.lang.RuntimeException, exceptionMessage=java.lang.RuntimeException: com.amazonaws.services.s3.model.AmazonS3Exception: The bucket does not allow ACLs (Service: Amazon S3; Status Code: 400; Error Code: AccessControlListNotSupported; Request ID: ZYKNBCGH229MBPNX; S3 Extended Request ID: bx6vhfB4dzh91XYCOr1I1jCQzPOl2mLWrxGgBBz20Sgn+nqd614y7ZEFvg080sLtPFcPJlTJCNo=), S3 Extended Request ID: bx6vhfB4dzh91XYCOr1I1jCQzPOl2mLWrxGgBBz20Sgn+nqd614y7ZEFvg080sLtPFcPJlTJCNo=, task=index_parallel_easy_s3_disabled_acl_eq_false_linemdcd_2022-03-30T22:46:20.595Z, type=index_parallel, dataSource=easy_s3_disabled_acl_eq_false} 

I would have posted the full log if I hadn’t run the command base64 /tmp/s3.logs.txt.gz > /tmp/s3.logs.txt.gz (overwriting the file with its own base64 output).

No datasource is even created; yet again, no sign of the segment file in the log, just the indexing log. No file was created on S3 with this config.

So far, what I know:
Using real S3:
druid.storage.disableAcl=false does in fact enable the ACLs, so any S3 bucket with ACLs disabled is not supposed to work (at least on 0.22.1).

druid.storage.disableAcl=true partially disables them: the segment is created, but the task fails when pushing its indexing log (I didn’t know what this file is used for¹; I thought the whole index was saved inside the segment).

Using B2 S3:
With druid.storage.disableAcl=true, since B2 buckets do support some parts of the ACL API, the segment is created, but the historical then fails with a null pointer exception before it actually downloads it.
With druid.storage.disableAcl=false, B2 returns:

[druid-druid-cluster-coordinators-0] 2022-03-30T23:57:51,973 ERROR [forking-task-runner-0] org.apache.druid.indexing.overlord.TaskQueue - Failed to run task: {class=org.apache.druid.indexing.overlord.TaskQueue, exceptionType=class java.lang.RuntimeException, exceptionMessage=java.lang.RuntimeException: com.amazonaws.services.s3.model.AmazonS3Exception: Backblaze does not support the 'x-amz-grant-full-control' header for this API call. (Service: Amazon S3; Status Code: 400; Error Code: InvalidArgument; Request ID: null; S3 Extended Request ID: null), S3 Extended Request ID: null, task=index_parallel_easy_b2_disabled_acl_eq_false_jkcenfoe_2022-03-30T23:57:04.186Z, type=index_parallel, dataSource=easy_b2_disabled_acl_eq_false}

– edit: the creds are all rotated after each post, so there’s no problem posting the trace logs

¹: with druid.indexer.logs.disableAcl=true I saw that it’s just the indexer process log and a JSON file with its stats, not related to the segment at all.
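
For reference, the layout I ended up with in the bucket looks roughly like this (paths simplified from what I saw):

druid/indexing-logs/<task_id>/log                                        (the task process log, controlled by druid.indexer.logs.*)
druid/segments/<datasource>/<interval>/<version>/<partition>/index.zip   (the segment itself, controlled by druid.storage.*)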

So, with real S3 and the same bucket created in the previous post:

druid.storage.disableAcl=true
druid.indexer.logs.disableAcl=true

everything works as expected

I tried with DigitalOcean Spaces, and it also worked, so the null pointer is something related to the B2 implementation :confused:

Another thing: I only managed to get any implementation working when using

druid.s3.enablePathStyleAccess=true

If I set it to false and use the full endpoint (bucket name in the hostname), e.g.:

druid.s3.endpoint.url=MY_BUCKET_NAME.nyc3.digitaloceanspaces.com

the job fails with a 404, but that’s a minor detail.
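
To recap, the combination of settings that worked for me (on real S3 and DigitalOcean Spaces; B2 itself still fails with the null pointer) is roughly the following; endpoint and region are placeholders here, filled in per provider:

# deep storage and indexing logs on S3, with ACLs disabled on both code paths
druid.storage.type=s3
druid.storage.bucket=bucket-name
druid.storage.disableAcl=true
druid.indexer.logs.type=s3
druid.indexer.logs.s3Bucket=bucket-name
druid.indexer.logs.s3Prefix=druid/indexing-logs
druid.indexer.logs.disableAcl=true
# path-style access, with the regional endpoint only (no bucket name in the hostname)
druid.s3.enablePathStyleAccess=true
druid.s3.protocol=https
druid.s3.endpoint.url=<regional endpoint, e.g. s3.us-west-004.backblazeb2.com>
druid.s3.endpoint.signingRegion=<matching region, e.g. us-west-004>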