Not able to replace ZooKeeper with K8S Cluster

Hello Team,

I am new to Druid. Because of recurring ZooKeeper (ZK) issues with Druid, I am trying the option described at the link below to replace the ZK dependency with the Kubernetes (K8s) cluster.

https://druid.apache.org/docs/latest/development/extensions-core/kubernetes.html

Below are the config changes I have made for the Coordinator service:

druid.zk.service.enabled=false
druid.serverview.type=http
druid.coordinator.loadqueuepeon.type=http
druid.indexer.runner.type=httpRemote
druid.discovery.type=k8s

druid.discovery.k8s.clusterIdentifier=druid-staging
druid.discovery.k8s.podNameEnvKey=POD_NAME
druid.discovery.k8s.podNamespaceEnvKey=POD_NAMESPACE

Below is the exception thrown:

Exception in thread "main" java.lang.RuntimeException: com.google.inject.CreationException: Unable to create injector, see the following errors:

  1. Unknown provider[k8s] of Key[type=org.apache.druid.discovery.DruidNodeAnnouncer, annotation=[none]], known options[[curator]]
    at org.apache.druid.guice.PolyBind.createChoiceWithDefault(PolyBind.java:109) (via modules: com.google.inject.util.Modules$OverrideModule → com.google.inject.util.Modules$OverrideModule → org.apache.druid.curator.discovery.DiscoveryModule)
    while locating org.apache.druid.discovery.DruidNodeAnnouncer
    for field at org.apache.druid.cli.ServerRunnable$DiscoverySideEffectsProvider.announcer(ServerRunnable.java:221)
    at org.apache.druid.cli.ServerRunnable.bindAnnouncer(ServerRunnable.java:118) (via modules: com.google.inject.util.Modules$OverrideModule → com.google.inject.util.Modules$OverrideModule → org.apache.druid.cli.CliCoordinator$1)

  2. Unknown provider[k8s] of Key[type=org.apache.druid.discovery.DruidNodeAnnouncer, annotation=[none]], known options[[curator]]
    at org.apache.druid.guice.PolyBind.createChoiceWithDefault(PolyBind.java:109) (via modules: com.google.inject.util.Modules$OverrideModule → com.google.inject.util.Modules$OverrideModule → org.apache.druid.curator.discovery.DiscoveryModule)
    while locating org.apache.druid.discovery.DruidNodeAnnouncer
    for field at org.apache.druid.cli.ServerRunnable$DiscoverySideEffectsProvider.announcer(ServerRunnable.java:221)
    at org.apache.druid.cli.ServerRunnable.bindAnnouncer(ServerRunnable.java:118) (via modules: com.google.inject.util.Modules$OverrideModule → com.google.inject.util.Modules$OverrideModule → org.apache.druid.cli.CliOverlord$1)

2 errors
at org.apache.druid.cli.GuiceRunnable.makeInjector(GuiceRunnable.java:72)
at org.apache.druid.cli.ServerRunnable.run(ServerRunnable.java:62)
at org.apache.druid.cli.Main.main(Main.java:113)
Caused by: com.google.inject.CreationException: Unable to create injector, see the following errors:

  1. Unknown provider[k8s] of Key[type=org.apache.druid.discovery.DruidNodeAnnouncer, annotation=[none]], known options[[curator]]
    at org.apache.druid.guice.PolyBind.createChoiceWithDefault(PolyBind.java:109) (via modules: com.google.inject.util.Modules$OverrideModule → com.google.inject.util.Modules$OverrideModule → org.apache.druid.curator.discovery.DiscoveryModule)
    while locating org.apache.druid.discovery.DruidNodeAnnouncer
    for field at org.apache.druid.cli.ServerRunnable$DiscoverySideEffectsProvider.announcer(ServerRunnable.java:221)
    at org.apache.druid.cli.ServerRunnable.bindAnnouncer(ServerRunnable.java:118) (via modules: com.google.inject.util.Modules$OverrideModule → com.google.inject.util.Modules$OverrideModule → org.apache.druid.cli.CliCoordinator$1)

  2. Unknown provider[k8s] of Key[type=org.apache.druid.discovery.DruidNodeAnnouncer, annotation=[none]], known options[[curator]]
    at org.apache.druid.guice.PolyBind.createChoiceWithDefault(PolyBind.java:109) (via modules: com.google.inject.util.Modules$OverrideModule → com.google.inject.util.Modules$OverrideModule → org.apache.druid.curator.discovery.DiscoveryModule)
    while locating org.apache.druid.discovery.DruidNodeAnnouncer
    for field at org.apache.druid.cli.ServerRunnable$DiscoverySideEffectsProvider.announcer(ServerRunnable.java:221)
    at org.apache.druid.cli.ServerRunnable.bindAnnouncer(ServerRunnable.java:118) (via modules: com.google.inject.util.Modules$OverrideModule → com.google.inject.util.Modules$OverrideModule → org.apache.druid.cli.CliOverlord$1)

2 errors
at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:470)
at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:176)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:110)
at com.google.inject.Guice.createInjector(Guice.java:99)
at com.google.inject.Guice.createInjector(Guice.java:73)
at com.google.inject.Guice.createInjector(Guice.java:62)
at org.apache.druid.initialization.Initialization.makeInjectorWithModules(Initialization.java:433)
at org.apache.druid.cli.GuiceRunnable.makeInjector(GuiceRunnable.java:69)

Could you please help me with this? Let me know if I should share any other information.

Thanks,
Keerthi Kumar N

Hi Keerthi Kumar,

I have not tried this extension, but it seems like it’s not being loaded?

Have you added druid-kubernetes-extensions to the druid.extensions.loadList?
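For reference, a minimal sketch of what that setting might look like, assuming no other extensions are being loaded (any extensions you already load would stay in the list alongside it):

```properties
# Assumed location: common.runtime.properties
# Add the Kubernetes extension to whatever is already in your load list.
druid.extensions.loadList=["druid-kubernetes-extensions"]
```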

Kyle

Hello @Kyle_Hoondert ,

Thanks for your response. As you suggested, I had missed including druid-kubernetes-extensions earlier. I have included it now, and the error below is shown in the Coordinator service logs.

Caused by: java.lang.IllegalArgumentException: Cannot construct instance of org.apache.druid.k8s.discovery.K8sDiscoveryConfig, problem: null/empty clusterIdentifier
at [Source: UNKNOWN; line: -1, column: -1]
at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:3938)
at com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:3869)
at org.apache.druid.guice.JsonConfigurator.configurate(JsonConfigurator.java:119)
at org.apache.druid.guice.JsonConfigProvider.get(JsonConfigProvider.java:243)
at org.apache.druid.guice.JsonConfigProvider.get(JsonConfigProvider.java:81)
at com.google.inject.internal.ProviderInternalFactory.provision(ProviderInternalFactory.java:81)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.provision(InternalFactoryToInitializableAdapter.java:53)
at com.google.inject.internal.ProviderInternalFactory.circularGet(ProviderInternalFactory.java:61)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.get(InternalFactoryToInitializableAdapter.java:45)
at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:194)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)
at com.google.inject.internal.InjectorImpl$2$1.call(InjectorImpl.java:1019)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1015)
at org.apache.druid.guice.SupplierProvider.get(SupplierProvider.java:52)
at com.google.inject.internal.ProviderInternalFactory.provision(ProviderInternalFactory.java:81)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.provision(InternalFactoryToInitializableAdapter.java:53)
at com.google.inject.internal.ProviderInternalFactory.circularGet(ProviderInternalFactory.java:61)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.get(InternalFactoryToInitializableAdapter.java:45)
at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:38)
at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:62)
at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:110)
at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:90)
at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:268)
at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:194)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)
at com.google.inject.internal.SingleFieldInjector.inject(SingleFieldInjector.java:54)
at com.google.inject.internal.MembersInjectorImpl.injectMembers(MembersInjectorImpl.java:132)
at com.google.inject.internal.MembersInjectorImpl$1.call(MembersInjectorImpl.java:93)
at com.google.inject.internal.MembersInjectorImpl$1.call(MembersInjectorImpl.java:80)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1085)
at com.google.inject.internal.MembersInjectorImpl.injectAndNotify(MembersInjectorImpl.java:80)
at com.google.inject.internal.Initializer$InjectableReference.get(Initializer.java:223)
at com.google.inject.internal.Initializer.injectAll(Initializer.java:132)
at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:174)
… 8 more
Caused by: com.fasterxml.jackson.databind.exc.ValueInstantiationException: Cannot construct instance of org.apache.druid.k8s.discovery.K8sDiscoveryConfig, problem: null/empty clusterIdentifier
at [Source: UNKNOWN; line: -1, column: -1]
at com.fasterxml.jackson.databind.exc.ValueInstantiationException.from(ValueInstantiationException.java:47)
at com.fasterxml.jackson.databind.DeserializationContext.instantiationException(DeserializationContext.java:1735)
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapAsJsonMappingException(StdValueInstantiator.java:491)
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.rewrapCtorProblem(StdValueInstantiator.java:514)
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:285)
at com.fasterxml.jackson.databind.deser.ValueInstantiator.createFromObjectWith(ValueInstantiator.java:229)
at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:198)
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:488)
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1292)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:326)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:159)
at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:3933)
… 48 more
Caused by: java.lang.IllegalArgumentException: null/empty clusterIdentifier
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:125)
at org.apache.druid.k8s.discovery.K8sDiscoveryConfig.<init>(K8sDiscoveryConfig.java:75)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:124)
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:283)
… 55 more

Hello @Kyle_Hoondert … any idea on the above exception?

I’m wondering where you included druid.discovery.k8s.clusterIdentifier=druid-staging. Did you include it only in the Coordinator’s runtime.properties, or is it in common.runtime.properties?

I think it needs to be in common.runtime.properties so that it is set on each pod.
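As a rough sketch of that suggestion, the discovery settings from the first post could be moved into the shared file like this. The file layout is an assumption about your deployment, and the two EnvKey settings name environment variables that are assumed to be set on every pod:

```properties
# common.runtime.properties (assumed to be read by every Druid service)
druid.zk.service.enabled=false
druid.serverview.type=http
druid.coordinator.loadqueuepeon.type=http
druid.indexer.runner.type=httpRemote
druid.discovery.type=k8s

# Same identifier on every pod of this Druid cluster
druid.discovery.k8s.clusterIdentifier=druid-staging
# Names of the environment variables holding the pod's name and namespace
druid.discovery.k8s.podNameEnvKey=POD_NAME
druid.discovery.k8s.podNamespaceEnvKey=POD_NAMESPACE
```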

Hello @Kyle_Hoondert ,

Thanks a ton for your help. I was able to configure everything needed to replace the ZooKeeper dependency in Druid with the K8s cluster. However, I would like to know the details below; kindly help me with them.

  • What percentage of the ZK dependency is reduced, replaced, or avoided with this configuration?

  • Apart from checking the Druid service logs, is there any other way to confirm that the Druid processes are using the K8s cluster instead of ZK?

This information will be really helpful for analyzing a few more config changes, and maybe for some optimization as well.

Thanks,
Keerthi Kumar N

Hi @keerthikumar - glad you’re able to get things going!

I’m not the right person to answer any questions about removing ZooKeeper using K8s. As I mentioned in my first post I have never tried this extension.

Druid uses ZK for:

  • Service discovery (apparently can be replaced with k8s in this extension)
  • Task management communication (can be replaced by setting druid.indexer.runner.type to httpRemote)
  • Segment discovery (can be replaced by setting druid.serverview.type to http)
  • Segment assignment, i.e. load/drop requests to Historicals (can be replaced by setting druid.coordinator.loadqueuepeon.type to http)

You may need to reach out to the contributors of the extension for further answers on this.

Kyle

There’s a video on YouTube, “Under-The-Hood of Druid without Zookeeper on Kubernetes” by Himanshu Gupta, with a deep dive into how the extension works.

However, I recommend not using it in production unless your K8s cluster version matches the one the extension was developed against!
I tried it for a few days with Druid 0.22.1 and K8s 1.21.11, and the services started throwing null pointer exceptions related to this extension. Some field may have moved or changed, and the extension was not updated.

Hello @Renato_Cron,

Thanks for your message. However, after adding the entries below to the common config properties file, I get the exception shown below:

druid.discovery.k8s.clusterIdentifier=druid-develop
druid.zk.service.enabled=false
druid.discovery.type=k8s
druid.serverview.type=http

Caused by: java.lang.IllegalArgumentException: clusterIdentifier[druid-develop druid.zk.service.enabled=false druid.discovery.type=k8s druid.serverview.type=http] is used in k8s resource name and must match regex[[a-z0-9][a-z0-9-]*[a-z0-9]]

Could you please help me with this?

Thanks,
Keerthi Kumar N

Hello @Renato_Cron , @Kyle_Hoondert,

I have also included druid-kubernetes-extensions in the common config properties file.
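
One observation on the clusterIdentifier regex error quoted earlier: the bracketed value contains druid-develop followed by the other three settings, so the four lines appear to have been read as a single property value. One possible cause, and this is only a guess from the message itself, is that the line breaks were lost when the file was generated (for example, by a YAML folded scalar in a ConfigMap). A sketch of the intended layout, with one property per line and nothing trailing after each value:

```properties
# Each setting on its own line, so clusterIdentifier is only "druid-develop"
druid.discovery.k8s.clusterIdentifier=druid-develop
druid.zk.service.enabled=false
druid.discovery.type=k8s
druid.serverview.type=http
```

Read this way, the identifier is just druid-develop, which satisfies the regex shown in the error.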