Reindexing task failed

Hi team!

I'm having trouble with a reindexing task.

Here is my reindexing JSON:


{
  "type": "index_hadoop",
  "spec": {
    "dataSchema": {
      "dataSource": "myDataSource"
    },
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "dataSource",
        "ingestionSpec": {
          "dataSource": "myDataSource",
          "intervals": ["2017-09-26T17:00:00Z/PT1H"]
        }
      }
    },
    "tuningConfig": {
      "type": "hadoop",
      "jobProperties": {
        "mapreduce.job.queuename": "root.druid.batch",
        "mapreduce.job.classloader": "true",
        "mapreduce.job.classloader.system.classes": "-javax.validation.,java.,javax.,org.apache.commons.logging.,org.apache.log4j.,org.apache.hadoop.",
        "mapreduce.map.memory.mb": 4096,
        "mapreduce.map.java.opts": "-server -Xmx4096m -Duser.timezone=UTC -Dfile.encoding=UTF-8",
        "mapreduce.reduce.memory.mb": 8192,
        "mapreduce.reduce.java.opts": "-server -Xmx8g -Duser.timezone=UTC -Dfile.encoding=UTF-8"
      },
      "partitionsSpec": {
        "type": "dimension",
        "partitionDimension": "pd",
        "targetPartitionSize": 8333333
      },
      "buildV9Directly": "true"
    }
  },
  "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:2.6.0"]
}

This is the indexer log from the Overlord web console UI:


2017-09-26T10:32:50,958 INFO [task-runner-0-priority-0] io.druid.indexing.common.task.HadoopIndexTask - Starting a hadoop determine configuration job...
2017-09-26T10:32:50,988 INFO [task-runner-0-priority-0] io.druid.indexer.path.DatasourcePathSpec - Found total [36] segments for [myDataSource]  in interval [[2017-09-26T17:00:00.000Z/2017-09-26T18:00:00.000Z]]
2017-09-26T10:32:50,988 WARN [task-runner-0-priority-0] io.druid.segment.indexing.DataSchema - No parser has been specified
2017-09-26T10:32:50,989 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_myDataSource_2017-09-26T10:32:43.389Z, type=index_hadoop, dataSource=myDataSource}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:211) ~[druid-indexing-service-0.10.0.jar:0.10.0]
	at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:176) ~[druid-indexing-service-0.10.0.jar:0.10.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.10.0.jar:0.10.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.10.0.jar:0.10.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_60]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_60]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_60]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_60]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_60]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_60]
	at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_60]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:208) ~[druid-indexing-service-0.10.0.jar:0.10.0]
	... 7 more
Caused by: java.lang.NullPointerException
	at io.druid.indexer.path.DatasourcePathSpec.addInputPaths(DatasourcePathSpec.java:117) ~[druid-indexing-hadoop-0.10.0.jar:0.10.0]
	at io.druid.indexer.HadoopDruidIndexerConfig.addInputPaths(HadoopDruidIndexerConfig.java:389) ~[druid-indexing-hadoop-0.10.0.jar:0.10.0]
	at io.druid.indexer.JobHelper.ensurePaths(JobHelper.java:337) ~[druid-indexing-hadoop-0.10.0.jar:0.10.0]
	at io.druid.indexer.HadoopDruidDetermineConfigurationJob.run(HadoopDruidDetermineConfigurationJob.java:55) ~[druid-indexing-hadoop-0.10.0.jar:0.10.0]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopDetermineConfigInnerProcessing.runTask(HadoopIndexTask.java:306) ~[druid-indexing-service-0.10.0.jar:0.10.0]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_60]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_60]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_60]
	at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_60]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:208) ~[druid-indexing-service-0.10.0.jar:0.10.0]
	... 7 more

The task was submitted via the following command:


curl -v -L -XPOST -H'Content-Type: application/json' -d @$jsonFile http://$overlord:$overlordPort/druid/indexer/v1/task

Can you please tell me what I'm doing wrong?

I have read the documentation several times but could not find the mistake.

Could you also share a full sample JSON? It would be very helpful.

Thanks, have a nice day.


Hello,

You're missing the parser definition in that ingestion spec; you can see an example and more documentation on that at:

http://druid.io/docs/latest/ingestion/batch-ingestion.html
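
As a rough sketch, a complete spec for your case could look something like the following; the parser's timestamp column, the dimensions, and the count metric are placeholders you would replace with your datasource's real schema:

{
  "type": "index_hadoop",
  "spec": {
    "dataSchema": {
      "dataSource": "myDataSource",
      "parser": {
        "type": "hadoopyString",
        "parseSpec": {
          "format": "json",
          "timestampSpec": {
            "column": "timestamp",
            "format": "auto"
          },
          "dimensionsSpec": {
            "dimensions": ["dim1", "dim2"]
          }
        }
      },
      "metricsSpec": [
        { "type": "count", "name": "count" }
      ],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "HOUR",
        "queryGranularity": "NONE",
        "intervals": ["2017-09-26T17:00:00Z/PT1H"]
      }
    },
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "dataSource",
        "ingestionSpec": {
          "dataSource": "myDataSource",
          "intervals": ["2017-09-26T17:00:00Z/PT1H"]
        }
      }
    },
    "tuningConfig": {
      "type": "hadoop"
    }
  },
  "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:2.6.0"]
}

As I understand it, with a dataSource inputSpec the parser is mostly used for its parseSpec (timestamp and dimensions), since the rows are read back from your existing segments rather than from raw files.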

Oh, thanks!

But I wonder why a reindexing task needs to know about the parser spec.

The datasource and interval are described in the JSON, and those files are already located in HDFS.

Then what is the difference between an indexing task and re-indexing?

If re-indexing has to know the parser spec details, then re-indexing is the same as batch indexing after all.

I thought of the reindexing task as a way to turn existing segments (probably many, since real-time Kafka indexing creates a lot of segments per partition and granularity) into more fine-grained segments.

Can I achieve this via a merge task?

Thanks~!

Have a nice day!


Hi,

I'm getting the same error, but I do have a parser definition.

2018-01-19T22:30:53,858 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_sauce_jobs_ds_2018-01-19T22:30:36.715Z, type=index_hadoop, dataSource=sauce_jobs_ds}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:218) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:226) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.11.0.jar:0.11.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_151]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_151]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_151]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_151]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_151]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_151]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:215) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	... 7 more
Caused by: io.druid.java.util.common.ISE: Job[class io.druid.indexer.IndexGeneratorJob] failed!
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:390) ~[druid-indexing-hadoop-0.11.0.jar:0.11.0]
	at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:95) ~[druid-indexing-hadoop-0.11.0.jar:0.11.0]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:279) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_151]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_151]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_151]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:215) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	... 7 more
2018-01-19T22:30:53,870 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_sauce_jobs_ds_2018-01-19T22:30:36.715Z] status changed to [FAILED].
2018-01-19T22:30:53,878 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_sauce_jobs_ds_2018-01-19T22:30:36.715Z",
  "status" : "FAILED",
  "duration" : 12139
}


console.log (29.6 KB)

Then what is the difference between an indexing task and re-indexing?

Re-indexing is a type of indexing task where the input data comes from existing Druid segments; conceptually they're the same kind of task.
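
Roughly, the only difference is where the input rows come from. A plain batch indexing task reads raw files, for example (the path here is just a placeholder):

"inputSpec" : {
  "type" : "static",
  "paths" : "hdfs://your-cluster/path/to/raw-data/part-*.json"
}

whereas a re-indexing task reads rows back out of segments Druid already has:

"inputSpec" : {
  "type" : "dataSource",
  "ingestionSpec" : {
    "dataSource" : "myDataSource",
    "intervals" : ["2017-09-26T17:00:00Z/PT1H"]
  }
}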

I thought of the reindexing task as a way to turn existing segments (probably many, since real-time Kafka indexing creates a lot of segments per partition and granularity) into more fine-grained segments. Can I achieve this via a merge task?

I don't think you can reindex existing segments into a granularity finer than the one they were originally ingested with (that "finer-grained" information would have been lost).

The merge task also doesn't work with segments generated by the Kafka indexing service; the reasoning is explained in the last paragraph of the doc page (http://druid.io/docs/latest/development/extensions-core/kafka-ingestion.html).
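
If the goal is mainly to collapse the many small Kafka-generated segments into fewer, larger ones, a dataSource re-indexing task with a coarser segmentGranularity should do that. A sketch of the relevant part of the spec, assuming your existing segments are hourly:

"granularitySpec": {
  "type": "uniform",
  "segmentGranularity": "DAY",
  "queryGranularity": "NONE",
  "intervals": ["2017-09-26T00:00:00Z/P1D"]
}

The targetPartitionSize in your partitionsSpec then controls roughly how many rows end up in each resulting segment.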

I'm getting the same error, but I do have a parser definition.

The actual error is shown further up in the log you uploaded:

java.lang.Exception: java.io.IOException: No FileSystem for scheme: s3n

	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.7.3.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) [hadoop-mapreduce-client-common-2.7.3.jar:?]
Caused by: java.io.IOException: No FileSystem for scheme: s3n
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660) ~[hadoop-common-2.7.3.jar:?]
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667) ~[hadoop-common-2.7.3.jar:?]
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94) ~[hadoop-common-2.7.3.jar:?]
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703) ~[hadoop-common-2.7.3.jar:?]
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685) ~[hadoop-common-2.7.3.jar:?]
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373) ~[hadoop-common-2.7.3.jar:?]
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295) ~[hadoop-common-2.7.3.jar:?]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:709) ~[druid-indexing-hadoop-0.11.0.jar:0.11.0]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:489) ~[druid-indexing-hadoop-0.11.0.jar:0.11.0]
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) ~[hadoop-mapreduce-client-common-2.7.3.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_151]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_151]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_151]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_151]

I would check that you have the Druid S3 deep storage extension loaded, and try setting your jobProperties in the indexing spec like the following:

"jobProperties" : {
   "fs.s3.awsAccessKeyId" : "YOUR_ACCESS_KEY",
   "fs.s3.awsSecretAccessKey" : "YOUR_SECRET_KEY",
   "fs.s3.impl" : "org.apache.hadoop.fs.s3native.NativeS3FileSystem",
   "fs.s3n.awsAccessKeyId" : "YOUR_ACCESS_KEY",
   "fs.s3n.awsSecretAccessKey" : "YOUR_SECRET_KEY",
   "fs.s3n.impl" : "org.apache.hadoop.fs.s3native.NativeS3FileSystem",
   "io.compression.codecs" : "org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec"
}

Hi Jonathan,

Thanks for the quick response. After adding the "jobProperties" config to the index JSON as you suggested, and hadoop-aws-2.7.3.jar to lib/ as mentioned in the post https://groups.google.com/forum/#!topic/druid-user/HhcMkkbKRXI, everything works.

Thanks so much again. :)

I'm trying to reindex segments stored in S3 as well, but I'm getting errors related to the S3 credentials. I could really use some help diagnosing this.
I'm supplying the credentials in the jobProperties. This works for me if the inputSpec is 'static' but fails when using 'dataSource'.
I'm using Druid version 0.11.0. The credentials are also configured in the common config (druid.s3.accessKey, etc).
Here's the error log:

2018-02-26T19:22:31,729 WARN [task-runner-0-priority-0] org.apache.hadoop.fs.FileSystem - Cannot load filesystem

java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.s3a.S3AFileSystem could not be instantiated
	at java.util.ServiceLoader.fail(ServiceLoader.java:232) ~[?:1.8.0_66-internal]
	at java.util.ServiceLoader.access$100(ServiceLoader.java:185) ~[?:1.8.0_66-internal]
	at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384) ~[?:1.8.0_66-internal]
	at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) ~[?:1.8.0_66-internal]
	at java.util.ServiceLoader$1.next(ServiceLoader.java:480) ~[?:1.8.0_66-internal]
	at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2631) [hadoop-common-2.7.3.jar:?]
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2650) [hadoop-common-2.7.3.jar:?]

2018-02-26T19:22:32,056 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_npav-ts-metrics_2018-02-26T19:22:19.303Z, type=index_hadoop, dataSource=npav-ts-metrics}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:218) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:226) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.11.0.jar:0.11.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_66-internal]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_66-internal]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_66-internal]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66-internal]
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_66-internal]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_66-internal]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_66-internal]
	at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_66-internal]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:215) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	... 7 more
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3n URL, or by setting the fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties (respectively).

Here’s my task request:

{
  "type":"index_hadoop",
  "spec":{
    "dataSchema":{
      "dataSource":"npav-ts-metrics",
      "parser":{
        "type":"hadoopyString",
        "parseSpec":{
          "format":"json",
          "timestampSpec":{
            "column":"timestamp",
            "format":"auto"
          },
          "dimensionsSpec":{
            "dimensions":[
              "tenantId"
            ],
            "dimensionExclusions":[
              "timestamp"
            ]
          }
        }
      },
      "metricsSpec":[
        {
          "name":"rowCount",
          "type":"count"
        },
        {
          "fieldName":"delayMin",
          "name":"delayMin",
          "type":"longMin"
        },
        {
          "fieldName":"delayMax",
          "name":"delayMax",
          "type":"longMax"
        }
      ],
      "granularitySpec":{
        "type":"uniform",
        "segmentGranularity":"DAY",
        "queryGranularity":"NONE",
        "intervals":[
          "2018-02-25T18:00:00.000Z/2018-02-25T19:00:00.000Z"
        ]
      }
    },
    "ioConfig":{
      "type":"hadoop",
      "inputSpec":{
        "type":"dataSource",
        "ingestionSpec":{
          "dataSource":"npav-ts-metrics",
          "intervals":[
            "2018-02-25T18:00:00.000Z/2018-02-25T19:00:00.000Z"
          ]
        }
      }
    },
    "tuningConfig":{
      "type":"hadoop",
      "jobProperties":{
        "fs.s3.awsAccessKeyId":"MY_ACCESS_KEY",
        "fs.s3n.awsAccessKeyId":"MY_ACCESS_KEY",
        "fs.s3a.awsAccessKeyId":"MY_ACCESS_KEY",
        "fs.s3.awsSecretAccessKey":"MY_SECRET_KEY",
        "fs.s3n.awsSecretAccessKey":"MY_SECRET_KEY",
        "fs.s3a.awsSecretAccessKey":"MY_SECRET_KEY",
        "fs.s3.impl":"org.apache.hadoop.fs.s3native.NativeS3FileSystem",
        "fs.s3n.impl":"org.apache.hadoop.fs.s3native.NativeS3FileSystem",
        "io.compression.codecs":"org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec"
      }
    }
  }
}


Thanks,
Liz

We have the same issue. Ingesting the data works fine, but when we try to reindex it, we get the same error as you. We are sending the following task:

{
    "type" : "index_hadoop",
    "spec" : {
        "ioConfig" : {
            "type" : "hadoop",
            "inputSpec" : {
                "type" : "dataSource",
                "ingestionSpec" : {
                    "dataSource": "reporting-cli-clean",
                    "intervals": ["2018-03-03T00:00:00Z/2018-03-03T23:59:59Z"]
                }
            }
        },
        "dataSchema" : {
            "dataSource" : "reporting-cli-clean",
            "granularitySpec": {
                "type": "uniform",
                "segmentGranularity": "day",
                "queryGranularity": "day",
                "intervals": [
                    "2018-03-03T00:00:00Z/2018-03-03T23:59:59Z"
                ]
            },
            "parser" : {
                "type" : "hadoopyString",
                "parseSpec" : {
                    "format": "json",
                    "timestampSpec": {
                        "column": "request_datetime",
                        "format": "YYYY-MM-dd HH:mm:ss"
                    },
                    "dimensionsSpec": {
                        "dimensions": [
                            "whatever"
                        ],
                        "dimensionExclusions": [],
                        "spatialDimensions": []
                    }
                }
            },
            "metricsSpec": [
                {
                    "name": "count",
                    "type": "count"
                }
            ]
        },
        "tuningConfig": {
            "type": "hadoop",
            "jobProperties": {
                "fs.s3n.awsAccessKeyId": "xxx",
                "fs.s3n.awsSecretAccessKey": "xxx",
                "fs.s3n.impl": "org.apache.hadoop.fs.s3native.NativeS3FileSystem"
            }
        }
    }
}

Do you have any idea why this is failing?

Also, why is it necessary to specify the S3 credentials when we are not getting any of the files from S3? Druid shouldn’t need them, right?

Thank you!
Darío.

Yes, exactly the same issue here. Can some experts help us?

If you're seeing AWS credentials issues on reindexing tasks, you may be hitting the following bug (the issue has a link to the fix):

Also, why is it necessary to specify the S3 credentials when we are not getting any of the files from S3? Druid shouldn't need them, right?

The reindexing task would pull its input segments from S3 if that's the configured deep storage, so the credentials are needed.
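
To make that concrete: with S3 deep storage, each input segment is described in the metadata store by a loadSpec roughly along these lines (bucket and key here are placeholders):

"loadSpec": {
  "type": "s3_zip",
  "bucket": "your-deep-storage-bucket",
  "key": "druid/segments/<dataSource>/<interval>/<version>/0/index.zip"
}

As far as I can tell, the Hadoop job resolves that to an s3n:// path and reads it itself, which is why the fs.s3n.* jobProperties (and the hadoop-aws jar) matter even though your raw input isn't coming from S3.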