Using Azure Blob storage as Deep storage for Druid

We are trying to use Azure Storage (v2) as deep storage for Druid 0.17.

Please let me know how to fix this issue. I appreciate your help.

I followed the instructions here: https://druid.apache.org/docs/latest/development/extensions-contrib/azure.html

I copied druid-azure-extensions (from https://mvnrepository.com/artifact/io.druid.extensions/druid-azure-extensions/0.8.3) to the /druid/extensions folder.

I also updated /druid/…/_comm/common.runtime.properties and added these lines:

druid.extensions.loadList=["druid-hdfs-storage", "druid-kafka-indexing-service", "druid-datasketches", "druid-azure-extensions"]

druid.storage.type=azure

druid.azure.account=[accountname]

druid.azure.key=[key]

druid.azure.container=[containername]

druid.azure.protocol=https

druid.azure.maxTries=3
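As a sanity check on settings like the ones above, here is a minimal sketch that parses the properties text and reports obvious gaps. It assumes the load list is written as a JSON array, as in this post. The function names, the sample account, key and container values, and the set of keys checked are illustrative assumptions, not part of Druid.

```python
import json


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_deep_storage(props):
    """Return a list of human-readable problems found in the parsed properties."""
    problems = []
    raw = props.get("druid.extensions.loadList", "[]")
    try:
        load_list = json.loads(raw)
    except json.JSONDecodeError:
        # Typographic quotes pasted from a web page are not valid JSON.
        problems.append("loadList is not valid JSON (check for curly quotes)")
        load_list = []
    if props.get("druid.storage.type") == "azure":
        if "druid-azure-extensions" not in load_list:
            problems.append("storage type is azure but druid-azure-extensions is not in loadList")
        # Keys taken from the post; the values in SAMPLE are placeholders.
        for key in ("druid.azure.account", "druid.azure.key", "druid.azure.container"):
            if not props.get(key):
                problems.append(f"missing {key}")
    return problems


SAMPLE = """\
druid.extensions.loadList=["druid-hdfs-storage", "druid-azure-extensions"]
druid.storage.type=azure
druid.azure.account=myaccount
druid.azure.key=placeholder
druid.azure.container=mycontainer
"""

if __name__ == "__main__":
    print(check_deep_storage(parse_properties(SAMPLE)))
```

A check like this only confirms that the file is internally consistent. It cannot tell whether the extension jar on disk actually provides the `azure` storage type to the running Druid version.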

I see these lines in the coordinator-overlord log, which seem to indicate that the Azure extension loaded successfully:

2020-04-16T20:25:09,230 INFO [main] org.apache.druid.initialization.Initialization - Loading extension [druid-azure-extensions] for class [interface org.apache.druid.cli.CliCommandCreator]

2020-04-16T20:25:09,230 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/datadrive/druid/extensions/druid-azure-extensions/javax.inject-1.jar] for extension[druid-azure-extensions]

2020-04-16T20:25:09,230 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/datadrive/druid/extensions/druid-azure-extensions/jackson-databind-2.4.6.jar] for extension[druid-azure-extensions]

2020-04-16T20:25:09,230 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/datadrive/druid/extensions/druid-azure-extensions/jackson-module-guice-2.4.6.jar] for extension[druid-azure-extensions]

2020-04-16T20:25:09,231 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/datadrive/druid/extensions/druid-azure-extensions/azure-storage-2.1.0.jar] for extension[druid-azure-extensions]

2020-04-16T20:25:09,232 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/datadrive/druid/extensions/druid-azure-extensions/jackson-annotations-2.4.0.jar] for extension[druid-azure-extensions]

2020-04-16T20:25:09,232 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/datadrive/druid/extensions/druid-azure-extensions/jackson-core-2.4.6.jar] for extension[druid-azure-extensions]

2020-04-16T20:25:09,232 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/datadrive/druid/extensions/druid-azure-extensions/druid-azure-extensions-0.12.1.jar] for extension[druid-azure-extensions]

2020-04-16T20:25:09,232 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/datadrive/druid/extensions/druid-azure-extensions/guice-3.0.jar] for extension[druid-azure-extensions]

2020-04-16T20:25:09,232 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/datadrive/druid/extensions/druid-azure-extensions/aopalliance-1.0.jar] for extension[druid-azure-extensions]

2020-04-16T20:25:09,774 INFO [main] org.apache.druid.initialization.Initialization - Adding implementation [org.apache.druid.common.aws.AWSModule] for class [interface org.apache.druid.initialization.DruidModule] from classpath extension

2020-04-16T20:25:09,778 INFO [main] org.apache.druid.initialization.Initialization - Adding implementation [org.apache.druid.common.gcp.GcpModule] for class [interface org.apache.druid.initialization.DruidModule] from classpath extension

2020-04-16T20:25:09,779 INFO [main] org.apache.druid.initialization.Initialization - Loading extension [druid-hdfs-storage] for class [interface org.apache.druid.initialization.DruidModule]
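The excerpt above lists a druid-azure-extensions-0.12.1.jar, while the post says the cluster runs Druid 0.17. Whether that version gap is the cause is not established in this thread, but it is easy to surface. The sketch below pulls the jar names per extension out of "added URL[...] for extension[...]" lines and reads the version off the extension's own jar. The helper names are hypothetical, and it assumes each log entry is on a single line.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   ... added URL[file:/path/to/foo-1.0.jar] for extension[some-extension]
JAR_LINE = re.compile(
    r"added URL\[file:(?P<path>[^\]]+\.jar)\] for extension\[(?P<ext>[^\]]+)\]"
)


def jars_by_extension(log_text):
    """Map each extension name to the jar file names the log says were added."""
    jars = defaultdict(list)
    for match in JAR_LINE.finditer(log_text):
        jar_name = match.group("path").rsplit("/", 1)[-1]
        jars[match.group("ext")].append(jar_name)
    return dict(jars)


def extension_jar_version(jar_names, extension):
    """Return the version part of '<extension>-<version>.jar', or None if absent."""
    prefix = extension + "-"
    for name in jar_names:
        if name.startswith(prefix) and name.endswith(".jar"):
            return name[len(prefix):-len(".jar")]
    return None


SAMPLE_LOG = (
    "2020-04-16T20:25:09,231 INFO [main] org.apache.druid.initialization.Initialization - "
    "added URL[file:/datadrive/druid/extensions/druid-azure-extensions/azure-storage-2.1.0.jar] "
    "for extension[druid-azure-extensions]\n"
    "2020-04-16T20:25:09,232 INFO [main] org.apache.druid.initialization.Initialization - "
    "added URL[file:/datadrive/druid/extensions/druid-azure-extensions/druid-azure-extensions-0.12.1.jar] "
    "for extension[druid-azure-extensions]\n"
)

if __name__ == "__main__":
    jars = jars_by_extension(SAMPLE_LOG)
    print(extension_jar_version(jars["druid-azure-extensions"], "druid-azure-extensions"))
```

Comparing the extracted version with the Druid version in use is left to the reader; the sketch only reports what the log says was loaded.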

However, a little further down I also see the following. I am not sure whether the WARN lines are a problem here:

2020-04-16T20:25:09,794 INFO [main] org.apache.druid.initialization.Initialization - Loading extension [druid-azure-extensions] for class [interface org.apache.druid.initialization.DruidModule]

2020-04-16T20:25:12,332 WARN [main] org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform… using builtin-java classes where applicable

2020-04-16T20:25:14,368 WARN [main] org.apache.curator.retry.ExponentialBackoffRetry - maxRetries too large (30). Pinning to 29

In the index logs I see this error:

2020-04-16T20:49:16,513 INFO [main] org.apache.druid.initialization.Initialization - Loading extension [druid-azure-extensions] for class [interface org.apache.druid.initialization.DruidModule]

2020-04-16T20:49:17,230 WARN [main] org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform… using builtin-java classes where applicable

2020-04-16T20:49:18,025 WARN [main] org.apache.curator.retry.ExponentialBackoffRetry - maxRetries too large (30). Pinning to 29

2020-04-16T20:49:18,833 ERROR [main] org.apache.druid.cli.CliPeon - Error when starting up. Failing.

com.google.inject.ProvisionException: Unable to provision, see the following errors:

  1. Unknown provider[azure] of Key[type=org.apache.druid.segment.loading.DataSegmentPusher, annotation=[none]], known options[[hdfs, local]]

at org.apache.druid.guice.PolyBind.createChoice(PolyBind.java:71) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.guice.LocalDataStorageDruidModule)

while locating org.apache.druid.segment.loading.DataSegmentPusher

for the 5th parameter of org.apache.druid.indexing.common.TaskToolboxFactory.<init>(TaskToolboxFactory.java:119)

at org.apache.druid.cli.CliPeon.bindTaskConfigAndClients(CliPeon.java:391) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)

while locating org.apache.druid.indexing.common.TaskToolboxFactory

for the 1st parameter of org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner.<init>(SingleTaskBackgroundRunner.java:95)

at org.apache.druid.cli.CliPeon$1.configure(CliPeon.java:200) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)

while locating org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner

while locating org.apache.druid.indexing.overlord.TaskRunner

for the 4th parameter of org.apache.druid.indexing.worker.executor.ExecutorLifecycle.<init>(ExecutorLifecycle.java:80)

at org.apache.druid.cli.CliPeon$1.configure(CliPeon.java:184) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)

while locating org.apache.druid.indexing.worker.executor.ExecutorLifecycle

  2. Error injecting constructor, java.lang.IllegalArgumentException: Can not create a Path from an empty string

at org.apache.druid.storage.hdfs.HdfsDataSegmentKiller.<init>(HdfsDataSegmentKiller.java:47)

while locating org.apache.druid.storage.hdfs.HdfsDataSegmentKiller

at org.apache.druid.storage.hdfs.HdfsStorageDruidModule.configure(HdfsStorageDruidModule.java:93) (via modules: com.google.inject.util.Modules$OverrideModule -> org.apache.druid.storage.hdfs.HdfsStorageDruidModule)

while locating org.apache.druid.segment.loading.DataSegmentKiller annotated with @com.google.inject.multibindings.Element(setName=,uniqueId=147, type=MAPBINDER, keyType=java.lang.String)

at org.apache.druid.guice.Binders.dataSegmentKillerBinder(Binders.java:40) (via modules: com.google.inject.util.Modules$OverrideModule -> org.apache.druid.storage.hdfs.HdfsStorageDruidModule -> com.google.inject.multibindings.MapBinder$RealMapBinder)

while locating java.util.Map<java.lang.String, org.apache.druid.segment.loading.DataSegmentKiller>

for the 1st parameter of org.apache.druid.segment.loading.OmniDataSegmentKiller.<init>(OmniDataSegmentKiller.java:38)

while locating org.apache.druid.segment.loading.OmniDataSegmentKiller

at org.apache.druid.cli.CliPeon.bindPeonDataSegmentHandlers(CliPeon.java:355) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)

while locating org.apache.druid.segment.loading.DataSegmentKiller

for the 6th parameter of org.apache.druid.indexing.common.TaskToolboxFactory.<init>(TaskToolboxFactory.java:119)

at org.apache.druid.cli.CliPeon.bindTaskConfigAndClients(CliPeon.java:391) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)

while locating org.apache.druid.indexing.common.TaskToolboxFactory
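The key line in the trace above is the "Unknown provider[azure] ... known options[[hdfs, local]]" message: the peon was asked for an `azure` DataSegmentPusher but only `hdfs` and `local` were registered. As a rough sketch for scanning task logs for this pattern, the following extracts the requested provider and the registered options from such a message. The regular expression is written against the message format shown in this post only, and the function name is hypothetical.

```python
import re

# Written against the single message shape quoted in this thread.
UNKNOWN_PROVIDER = re.compile(
    r"Unknown provider\[(?P<provider>[^\]]+)\] of Key\[type=(?P<type>[^,\]]+)"
    r".*?known options\[\[(?P<options>[^\]]*)\]\]"
)


def parse_unknown_provider(message):
    """Return the requested provider, binding type and known options, or None."""
    match = UNKNOWN_PROVIDER.search(message)
    if match is None:
        return None
    options = [o.strip() for o in match.group("options").split(",") if o.strip()]
    return {
        "requested": match.group("provider"),
        "binding": match.group("type"),
        "known": options,
    }


MESSAGE = (
    "Unknown provider[azure] of Key[type=org.apache.druid.segment.loading.DataSegmentPusher, "
    "annotation=[none]], known options[[hdfs, local]]"
)

if __name__ == "__main__":
    info = parse_unknown_provider(MESSAGE)
    if info and info["requested"] not in info["known"]:
        print(f"{info['requested']!r} is not registered; available: {info['known']}")
```

If the requested provider is missing from the known options even though the extension's jars were added to the classpath, that suggests the loaded jars did not register that storage type with this Druid build.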

Have you added “druid-azure-extensions” to _common/common.runtime.properties on all the nodes, especially the Data servers?

Thanks for the reply. I am actually using a single-server (medium) setup. Once I get this working on one server, I plan to move to a multi-node cluster. There is currently only one server, and hence only one location for _common/common.runtime.properties.

With the 0.18 release, the Azure extension was promoted to a core extension, and this seems to have fixed the issue. Thanks for the new release.