Azure Data Lake Store attached to HDFS

Has anyone on the forums ever written Druid segment data to HDFS deep storage backed by Azure Data Lake Storage?

If you have, please let me know.

More information: I have built a two-node Hadoop cluster to support the HDFS filesystem I use for Druid deep storage. I successfully attached Azure Data Lake Storage Gen1 as non-default, additional storage for my Hadoop/HDFS cluster.
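In case the details help: attaching ADLS Gen1 means giving Hadoop the OAuth2 service principal credentials in core-site.xml, roughly like the following (all of the values below are placeholders; the full property list is in the hadoop-azure-datalake docs linked later in this thread):

```xml
<!-- core-site.xml: ADLS Gen1 access via an Azure AD service principal.
     Tenant ID, client ID, and secret below are placeholders. -->
<property>
  <name>fs.adl.oauth2.access.token.provider.type</name>
  <value>ClientCredential</value>
</property>
<property>
  <name>fs.adl.oauth2.refresh.url</name>
  <value>https://login.microsoftonline.com/YOUR_TENANT_ID/oauth2/token</value>
</property>
<property>
  <name>fs.adl.oauth2.client.id</name>
  <value>YOUR_CLIENT_ID</value>
</property>
<property>
  <name>fs.adl.oauth2.credential</name>
  <value>YOUR_CLIENT_SECRET</value>
</property>
```

Note that the org.apache.hadoop.fs.adl.AdlFileSystem class that handles adl:// URIs ships in the separate hadoop-azure-datalake jar, not in the core Hadoop client.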

I created my folders and granted permissions on ADLS using hdfs commands.
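For reference, the commands were along these lines (the account name, paths, and druid user are placeholders):

```sh
# Create the Druid deep storage directory on ADLS and grant access to the
# service account. Account name, paths, and user/group are placeholders.
hdfs dfs -mkdir -p adl://YOUR_ACCOUNT.azuredatalakestore.net/druid/segments
hdfs dfs -chown -R druid:druid adl://YOUR_ACCOUNT.azuredatalakestore.net/druid
hdfs dfs -chmod -R 755 adl://YOUR_ACCOUNT.azuredatalakestore.net/druid
```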

I copied the yarn-site.xml, core-site.xml, mapred-site.xml, and hdfs-site.xml files into the Druid classpath locations.
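Concretely, something like this, assuming the stock Druid conf layout (the /etc/hadoop/conf and /opt/druid paths are assumptions; adjust to your install):

```sh
# Put the Hadoop client configs on the classpath shared by all Druid services;
# conf/druid/_common is the shared config directory in the stock layout.
cp /etc/hadoop/conf/{core-site,hdfs-site,yarn-site,mapred-site}.xml \
   /opt/druid/conf/druid/_common/
```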

When ingesting data for the first time, the task fails with this error on the MiddleManager:

```
java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.adl.AdlFileSystem not found
```

I have tried both the short path and the fully qualified adl:// path to the folder on ADLS in the Druid deep storage configuration.
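For the record, my deep storage settings in common.runtime.properties look like this (the adl:// URI is a placeholder for my real account and folder):

```properties
# HDFS deep storage pointed at the ADLS-backed directory.
# Account name and path are placeholders.
druid.storage.type=hdfs
druid.storage.storageDirectory=adl://YOUR_ACCOUNT.azuredatalakestore.net/druid/segments
```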

Any suggestions?

Thank you.

The Internet says you need “azure-datalake-store.jar”: https://hadoop.apache.org/docs/r2.8.0/hadoop-azure-datalake/index.html. I’d start by getting a copy of that that matches the Hadoop version your Druid is built with (I think 2.8.3 is the most recent) and dropping that into the hadoop-dependencies subdirectory you’re using (by default there is only one in there).
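Concretely, that might look something like the following. The directory layout and the SDK version are guesses, so match them to your install; the connector jar also needs its Azure SDK dependency, which you can confirm in the hadoop-azure-datalake pom:

```sh
# Pull the ADLS connector and the Azure SDK it depends on from Maven Central,
# then drop them into Druid's hadoop-dependencies directory. The 2.8.3 paths
# and the 2.1.4 SDK version are assumptions; match them to the hadoop-client
# version your Druid install actually uses.
cd /opt/druid/hadoop-dependencies/hadoop-client/2.8.3
curl -O https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-azure-datalake/2.8.3/hadoop-azure-datalake-2.8.3.jar
curl -O https://repo1.maven.org/maven2/com/microsoft/azure/azure-data-lake-store-sdk/2.1.4/azure-data-lake-store-sdk-2.1.4.jar
```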

Thank you, Gian.

I will give it a shot.