Hi,
I’m currently working on upgrading our clusters from 0.15.1-incubating to 0.22.1.
After reading all the changelogs to identify what issues I could encounter, and concluding that there were none, I tried to update a test cluster.
The rolling update was done following the recommendations: http://druid.io/docs/0.9.0-rc1/operations/rolling-updates.html
And everything went well. I can still query my data and all nodes are up (after fixing a few things, but only minor fixes).
Honestly, that was very nice.
The last part I had to test was my ingestion tasks.
All my ingestion tasks follow the same pattern:
- Hadoop ingestion, with Parquet files stored in an AWS S3 bucket as the source (a trimmed example follows).
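For context, here is roughly what such a task looks like, trimmed to the ioConfig; the datasource, bucket, paths and Overlord host are made-up placeholders:

cat > hadoop-parquet-task.json <<'EOF'
{
  "type": "index_hadoop",
  "spec": {
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "static",
        "inputFormat": "org.apache.druid.data.input.parquet.DruidParquetInputFormat",
        "paths": "s3a://my-bucket/events/2022-01-18/*.parquet"
      }
    }
  }
}
EOF
# Submitted to the Overlord as usual:
curl -X POST -H 'Content-Type: application/json' \
  -d @hadoop-parquet-task.json http://overlord:8090/druid/indexer/v1/task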
Previously (in 0.15.1), to make that ingestion work, I had to manually add the AWS Hadoop dependencies on my MiddleManager nodes.
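Roughly, that manual step looked like this (versions and paths are from memory; the hadoop-aws version has to match the hadoop-client version shipped under hadoop-dependencies):

# Run from wherever the downloaded jars are. Drop the AWS Hadoop jars next
# to the hadoop-client jars that Druid loads for Hadoop ingestion tasks.
# HADOOP_VER must match the directory name under hadoop-dependencies/hadoop-client/.
HADOOP_VER=2.8.5
cp hadoop-aws-${HADOOP_VER}.jar aws-java-sdk-*.jar \
   /opt/druid-0.15.1/hadoop-dependencies/hadoop-client/${HADOOP_VER}/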
After reading the changelogs, I noticed this point in the 0.18 release notes: https://github.com/apache/druid/releases#upgrade-hadoop-aws
Nice: I could get rid of the manually added Hadoop dependencies and use the out-of-the-box ones.
However, this is not working as intended.
With the exact same ingestion task, I ran into this error:
2022-01-18T18:05:03,859 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at emr-cluster/172.30.xx.xx:8032
2022-01-18T18:05:04,364 WARN [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobResourceUploader - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2022-01-18T18:05:04,374 WARN [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2022-01-18T18:05:04,759 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobSubmitter - Cleaning up the staging area /tmp/hadoop-yarn/staging/druid/.staging/job_1640710029235_6755
2022-01-18T18:05:04,763 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.common.task.HadoopIndexTask - Encountered exception in HadoopIndexGeneratorInnerProcessing.
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
at org.apache.druid.indexer.IndexGeneratorJob.run(IndexGeneratorJob.java:242) ~[druid-indexing-hadoop-0.22.1.jar:0.22.1]
at org.apache.druid.indexer.JobHelper.runJobs(JobHelper.java:399) ~[druid-indexing-hadoop-0.22.1.jar:0.22.1]
at org.apache.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:100) ~[druid-indexing-hadoop-0.22.1.jar:0.22.1]
at org.apache.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessingRunner.runTask(HadoopIndexTask.java:834) [druid-indexing-service-0.22.1.jar:0.22.1]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_312]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_312]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_312]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_312]
at org.apache.druid.indexing.common.task.HadoopIndexTask.runInternal(HadoopIndexTask.java:442) [druid-indexing-service-0.22.1.jar:0.22.1]
at org.apache.druid.indexing.common.task.HadoopIndexTask.runTask(HadoopIndexTask.java:284) [druid-indexing-service-0.22.1.jar:0.22.1]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:159) [druid-indexing-service-0.22.1.jar:0.22.1]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:471) [druid-indexing-service-0.22.1.jar:0.22.1]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:443) [druid-indexing-service-0.22.1.jar:0.22.1]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_312]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_312]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_312]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_312]
Oh. Not nice at all.
The S3AFileSystem class should be located in a hadoop-aws-*.jar, according to https://hadoop.apache.org/docs/r3.1.2/hadoop-aws/tools/hadoop-aws/troubleshooting_s3a.html#ClassNotFoundException:_org.apache.hadoop.fs.s3a.S3AFileSystem.
When looking on my Druid node, I can find this class in only one place:
sh-4.2$ find /opt/druid-0.22.1 -name '*.jar' -print | while read i; do unzip -l "$i" | grep -Hsi S3AFileSystem && echo "$i"; done
(standard input): 716 09-10-2018 11:56 org/apache/hadoop/fs/s3a/S3AFileSystem$3.class
(standard input): 737 09-10-2018 11:56 org/apache/hadoop/fs/s3a/S3AFileSystem$4.class
(standard input): 8140 09-10-2018 11:56 org/apache/hadoop/fs/s3a/S3AFileSystem$WriteOperationHelper.class
(standard input): 960 09-10-2018 11:56 org/apache/hadoop/fs/s3a/S3AFileSystem$1.class
(standard input): 54631 09-10-2018 11:56 org/apache/hadoop/fs/s3a/S3AFileSystem.class
(standard input): 1238 09-10-2018 11:56 org/apache/hadoop/fs/s3a/S3AFileSystem$2.class
/opt/druid-0.22.1/extensions/druid-hdfs-storage/hadoop-aws-2.8.5.jar
The only hadoop-aws jar is located under the druid-hdfs-storage extension folder, and I am not using that extension.
There is nothing under the hadoop-dependencies folder.
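I double-checked that directly; this returns nothing:

find /opt/druid-0.22.1/hadoop-dependencies -name 'hadoop-aws*.jar'
# no output: no hadoop-aws jar anywhere under hadoop-dependencies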
Did I misunderstand the changelog entry? Does it apply only to the druid-hdfs-storage extension?
I thought I could also use the bundled AWS Hadoop libraries for my ingestion tasks.
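In the meantime, the workaround I am tempted to try is to reuse the jar that ships with the druid-hdfs-storage extension, assuming it only needs to be on the Hadoop task classpath (I am not sure this is supported):

# Untested idea: copy the bundled hadoop-aws jar next to the hadoop-client
# jars used by Hadoop ingestion tasks, then restart the MiddleManagers.
# Assumes 2.8.5 is the hadoop-client version present under hadoop-dependencies.
cp /opt/druid-0.22.1/extensions/druid-hdfs-storage/hadoop-aws-2.8.5.jar \
   /opt/druid-0.22.1/hadoop-dependencies/hadoop-client/2.8.5/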
NB: I know I could now use native ingestion, which is now capable of reading Parquet files (this was not the case back in 0.15.1), but we can't work on that for the moment.
Any hint will be appreciated.