Changing Druid deep storage type from local to S3

I have started a 3-node Druid cluster, but I failed to configure S3 as deep storage. I am ingesting real-time Kafka messages, and I need to switch the deep storage backend to S3 without losing data.

This is what I did: I set druid.indexer.task.restoreTasksOnRestart=true so that running tasks survive the restart.
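
For context, this is roughly the deep storage section of my common.runtime.properties. The bucket name, base key, and credentials below are placeholders, not my real values:

```
# Point deep storage at S3 instead of local disk
druid.storage.type=s3
druid.storage.bucket=my-druid-bucket
druid.storage.baseKey=druid/segments

# S3 credentials
druid.s3.accessKey=<access-key>
druid.s3.secretKey=<secret-key>

# Restore running tasks when the MiddleManager is restarted
druid.indexer.task.restoreTasksOnRestart=true
```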

I then restarted the Historical and MiddleManager processes on the data node. They came back healthy, but the ingestion job started failing with the error below:

```
2019-06-24T10:09:41,775 ERROR [main] org.apache.druid.cli.CliPeon - Error when starting up. Failing.
com.google.inject.ProvisionException: Unable to provision, see the following errors:

  1. Unknown provider[s3] of Key[type=org.apache.druid.segment.loading.DataSegmentPusher, annotation=[none]], known options[[local]]
    at org.apache.druid.guice.PolyBind.createChoice(PolyBind.java:71) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.guice.LocalDataStorageDruidModule)
    while locating org.apache.druid.segment.loading.DataSegmentPusher
    for the 4th parameter of org.apache.druid.indexing.common.TaskToolboxFactory.<init>(TaskToolboxFactory.java:113)
    at org.apache.druid.cli.CliPeon$1.configure(CliPeon.java:205) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)
    while locating org.apache.druid.indexing.common.TaskToolboxFactory
    for the 1st parameter of org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner.<init>(SingleTaskBackgroundRunner.java:95)
    at org.apache.druid.cli.CliPeon$1.configure(CliPeon.java:244) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)
    while locating org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner
    while locating org.apache.druid.indexing.overlord.TaskRunner
    for the 4th parameter of org.apache.druid.indexing.worker.executor.ExecutorLifecycle.<init>(ExecutorLifecycle.java:79)
    at org.apache.druid.cli.CliPeon$1.configure(CliPeon.java:228) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)
    while locating org.apache.druid.indexing.worker.executor.ExecutorLifecycle

1 error
at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1028) ~[guice-4.1.0.jar:?]
at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1050) ~[guice-4.1.0.jar:?]
at org.apache.druid.guice.LifecycleModule$2.start(LifecycleModule.java:136) ~[druid-core-0.14.1-incubating.jar:0.14.1-incubating]
at org.apache.druid.cli.GuiceRunnable.initLifecycle(GuiceRunnable.java:107) [druid-services-0.14.1-incubating.jar:0.14.1-incubating]
at org.apache.druid.cli.CliPeon.run(CliPeon.java:356) [druid-services-0.14.1-incubating.jar:0.14.1-incubating]
at org.apache.druid.cli.Main.main(Main.java:118) [druid-services-0.14.1-incubating.jar:0.14.1-incubating]

```

Hi Amal,

Do you see any errors in the coordinator logs?

Thanks,

Sashi

Hi Amal,

Looks like your S3 deep storage is not configured properly. Take a look at the Druid S3 extension documentation: https://druid.apache.org/docs/latest/development/extensions-core/s3.html
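
The "Unknown provider[s3] ... known options[[local]]" part of the peon error usually means the druid-s3-extensions extension is not being loaded on that node, so Druid only knows how to push segments to local storage. In addition to the druid.storage.* properties, the extension has to appear in druid.extensions.loadList in common.runtime.properties on the nodes that handle segments; something along these lines (the exact list depends on which other extensions you use, the Kafka one here is just an example):

```
druid.extensions.loadList=["druid-s3-extensions", "druid-kafka-indexing-service"]
```

The MiddleManager (and the peons it launches) and the Historical need to be restarted for the change to take effect.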

Yeah.

Thanks guys, it was exactly as you mentioned: the S3 extension was not loaded properly. I fixed it and restarted the MiddleManager and Historical, and now the jobs are running fine. I can see today's data in S3.

What about the old data? Will Druid move those segments to S3 automatically, or is there any documentation on how to import them if required?