Issues with HDFS as deep storage

I set up an indexing service using HDFS as deep storage, but nothing was written to HDFS; task logs and segments all went to local storage instead.

Here is what I did:

In _common/common.runtime.properties:

druid.extensions.coordinates=["io.druid.extensions:druid-kafka-eight","io.druid.extensions:mysql-metadata-storage","io.druid.extensions:druid-hdfs-storage"]
druid.storage.type=hdfs
druid.storage.storage.storageDirectory=hdfs://XXXX:8020/user/UUUU/druid
druid.indexer.task.hadoopWorkingPath=hdfs://XXXX:8020/user/UUUUdruid

In middleManager’s runtime.properties:

druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=hdfs://XXXX:8020/user/UUUU//druid/log

In the druid startup script, the classpath is composed as:

DRUID_CP=${SCRIPT_DIR}/config/_common
DRUID_CP=${DRUID_CP}:${SCRIPT_DIR}/config/middleManager  # or overlord, historical, ...
DRUID_CP=${DRUID_CP}:${SCRIPT_DIR}/lib/*
DRUID_CP=${DRUID_CP}:`hadoop classpath`
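The composition above can be written as a small runnable sketch. SCRIPT_DIR and NODE_TYPE are placeholders (set NODE_TYPE to middleManager, overlord, historical, etc. per node), and the `hadoop classpath` step is guarded for machines without Hadoop on the PATH:

```shell
#!/bin/sh
# Sketch of the startup classpath composition; the paths here are
# placeholders, not the actual install layout.
SCRIPT_DIR=/opt/druid
NODE_TYPE=middleManager

DRUID_CP=${SCRIPT_DIR}/config/_common
DRUID_CP=${DRUID_CP}:${SCRIPT_DIR}/config/${NODE_TYPE}
DRUID_CP=${DRUID_CP}:${SCRIPT_DIR}/lib/*
# Append the Hadoop client classpath so the druid-hdfs-storage
# extension can find the HDFS client jars and site configuration.
if command -v hadoop >/dev/null 2>&1; then
  DRUID_CP=${DRUID_CP}:$(hadoop classpath)
fi
echo "$DRUID_CP"
```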

I followed the suggestion in https://groups.google.com/forum/#!searchin/druid-user/hdfs$20deepstorage$20/druid-user/kQMzQpf9ZXY/QSdPiqELMz8J.

Hi - I think it should be:

druid.storage.storageDirectory=hdfs://XXXX:8020/user/UUUU/druid

Instead of druid.storage.storage.storageDirectory.
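For anyone hitting the same thing, the deep-storage section of _common/common.runtime.properties should read (note the single "storage" in the property name):

```properties
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://XXXX:8020/user/UUUU/druid
```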

Zach,

Thank you for pointing that out. Yes, that was the culprit; segment data is now showing up in HDFS.

One more question: do you know why the task logs still did not go to HDFS, despite this setting:

druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=hdfs://XXXX:8020/user/UUUU//druid/log

In the middleManager’s runtime.properties?

Task logs are uploaded to HDFS when the task completes. When you say the logs are not uploaded to HDFS, do you mean that after the task completes you still cannot see the log in HDFS? That config looks fine to me.
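A quick way to check (this is an assumption about your setup: it requires the hdfs CLI on the PATH of a node that can reach the cluster) is to list the configured log directory after a task finishes, using the path from your middleManager config:

```shell
#!/bin/sh
# Hypothetical post-completion check: each finished task should leave
# a log object under the configured druid.indexer.logs.directory.
LOG_DIR=hdfs://XXXX:8020/user/UUUU//druid/log
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -ls "$LOG_DIR"
else
  echo "hdfs CLI not found; run this from a node with Hadoop installed"
fi
```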