Set up Druid behind a firewall

I am trying to understand how druid handles dependencies. I am setting up druid behind a corporate firewall and cannot run the pull-deps command. However I see that the druid distribution already comes packaged with the needed extensions inside the extensions-repo folder. In my common.runtime.properties I have set

druid.extensions.coordinates= – so that druid does not try to download dependencies from maven

druid.extensions.localRepository=extensions-repo – local repository folder. Also tried the complete path

druid.extensions.remoteRepository=

With the settings above, should I expect druid to locate the required dependencies in “extensions-repo”?

I found that when I have some values inside “druid.extensions.coordinates” it tries to go to maven even if the required jar is inside “extensions-repo”

I have read in another discussion that adding the required extension jar to the classpath works. Is that the best process to follow? It looks like I might have to add each and every jar (dependencies and the nested dependencies) to the classpath in order to get this to work.

What is the best way to set up druid behind a firewall?

Hi Amol, this might be useful for you:
http://druid.io/docs/latest/operations/including-extensions.html

You can also take a look at http://imply.io/docs/latest/, which bundles all extensions in a single distribution.

Hi Fangjin,

I have been following these instructions:

I want classloader isolation, but I don’t want my production machines downloading their own dependencies. What should I do?

If you want to take advantage of the maven-based classloader isolation but you are also rightly frightened by the prospect of each of your production machines downloading their own dependencies on deploy, this section is for you.

The trick to doing this is

  1. Specify a local directory for druid.extensions.localRepository

  2. Run the tools pull-deps command to pull all the specified dependencies down into your local repository

  3. Bundle up the local repository along with your other Druid stuff into whatever you use for a deployable artifact

I bundled the extensions-repo folder into a jar and pointed druid.extensions.localRepository to this. That didn’t work. Also just pointing druid.extensions.localRepository to extensions-repo folder itself doesn’t work even though the jar is present inside extensions-repo. How do you expect the local repository to be packaged? tar? jar? gz?

  4. Run your Druid processes with druid.extensions.remoteRepositories=[] and a local repository set to wherever your bundled “local” repository is located

The Druid processes will then only load up jars from the local repository and will not try to go out onto the internet to find the maven dependencies.

Hi Fangjin,

I don’t want to try out “imply” just yet.

Can I keep the properties as below and expect druid to load all dependencies from the “extensions-repo” folder? I have all required jars in that folder along with correct mavenlike folder structure.

druid.extensions.coordinates=

druid.extensions.localRepository=extensions-repo

druid.extensions.remoteRepository=

When I do the above druid fails to load the extensions. I was trying to load kafka.

Thanks,

Amol Purohit

Hey Amol,

You should put the extensions you want to load inside “druid.extensions.coordinates” and also pull them down, along with their dependencies, into the local extensions-repo. So if you want to load just the mysql extension, it should look like this:

druid.extensions.coordinates=["io.druid.extensions:mysql-metadata-storage"]

druid.extensions.localRepository=extensions-repo

druid.extensions.remoteRepositories=

And then run pull-deps with that extension.
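
One way to sanity-check the copied repository before starting Druid is to confirm that each extension jar sits where a Maven-style layout expects it. The following is only a minimal sketch, not part of Druid: the function names are made up for illustration, it assumes the standard Maven layout (dots in the groupId become directories), and it requires an explicit version in each coordinate even though the coordinates in the config above omit it. It checks only the extension jars themselves, not their transitive dependencies.

```python
from pathlib import Path


def expected_jar_path(repo_root, coordinate):
    """Map 'groupId:artifactId:version' to its jar path in a Maven-style repository."""
    group_id, artifact_id, version = coordinate.split(":")
    # Maven layout: dots in the groupId become directory separators.
    return (
        Path(repo_root).joinpath(*group_id.split("."))
        / artifact_id
        / version
        / f"{artifact_id}-{version}.jar"
    )


def missing_jars(repo_root, coordinates):
    """Return the coordinates whose jar file is absent from the local repository."""
    return [c for c in coordinates if not expected_jar_path(repo_root, c).is_file()]
```

For example, `missing_jars("extensions-repo", ["io.druid.extensions:mysql-metadata-storage:0.8.3"])` returns the list of coordinates that still need to be pulled. The version string here is just a placeholder; use the version that matches your Druid install.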

This is the script we use to build a self-contained Druid distribution; you might find it useful: https://github.com/implydata/distribution/blob/master/druid/build

Hi Gian,

I cannot run pull-deps at my workplace. I was hoping to use the jars that extensions-repo already has (I think it has all the extension jars). How do I get druid to not go to maven and just use the extension jars from extensions-repo (or my local .m2 repo)?

~Amol

The extensions-repo in the Druid tarball does not actually contain all the necessary dependency jars. That should be fixed in 0.9 (along with a revamp of the extension system), but in current versions you do need to run pull-deps to actually get everything. You don’t have to do this on the machine you’ll actually run Druid on, so you could run pull-deps on a machine that does have direct Internet access, and then copy the directory to your server.
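
The copy step can be sketched roughly as follows. This is only an illustration under a few assumptions: the helper names and archive name are hypothetical, pull-deps has already populated the directory on the connected machine, and the archive is used purely for transport. On the server it is extracted back into a plain directory, and druid.extensions.localRepository points at that directory rather than at the archive.

```python
import tarfile
from pathlib import Path


def pack_repo(repo_dir, archive_path):
    """Archive the populated local repository (run on the machine with Internet access)."""
    repo_dir = Path(repo_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the directory name, not the full source path.
        tar.add(repo_dir, arcname=repo_dir.name)


def unpack_repo(archive_path, dest_dir):
    """Extract the archive on the firewalled server and return the repository path."""
    with tarfile.open(archive_path, "r:gz") as tar:
        # The first member is the top-level repository directory itself.
        top_level = tar.getnames()[0].split("/")[0]
        tar.extractall(dest_dir)
    return Path(dest_dir) / top_level
```

Running `pack_repo("extensions-repo", "extensions-repo.tar.gz")` on the connected machine and `unpack_repo("extensions-repo.tar.gz", "/path/to/druid")` on the server gives back the directory to use as the local repository. Any other transfer method (scp, rsync, a shared drive) should work just as well, as long as the directory structure is preserved.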

Thanks Gian and Fangjin,

I used the imply distribution and was able to successfully set it up since it comes with all the needed dependencies.