Druid Router console: frequent "Faulty channel in resource pool" error

Hi there,

I just finished building a new Druid cluster (0.14.0-incubating) on AWS and enabled the new Router console.

Here is my cluster setup:

6 nodes:

1 broker, which also runs the router

1 overlord

1 coordinator

2 historical

1 middleManager

For now, I don't have much data in the cluster; I only added the wikipedia datasource from the quickstart tutorial.

And it works almost perfectly.

By almost, I mean that the Router console sometimes gives me an error when I load the Datasources or Segments screens.

It freezes for about 20s (maybe it is loading) and then throws:

Unknown exception / Failure getting results for query[null] url[http://:8081/druid/coordinator/v1/metadata/segments]
because of [org.jboss.netty.channel.ChannelException: Faulty channel in
resource pool] / org.apache.druid.java.util.common.RE

The problem is that the error doesn't appear in the logs…

Moreover, I just noticed some strange behaviour:

The issue only happens when I switch from the Datasources screen to the Segments screen, or back.

When I fully reload the page (F5), it works perfectly, in less than 2s.
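
For anyone who wants to check whether the slowness comes from the console or from the endpoint itself, here is a minimal sketch that times a few GET requests against the URL from the error message. The function names `time_request` and `probe` are made up for this example, and the host is not shown in the error, so it has to be filled in by hand.

```python
import time
import urllib.request


def time_request(url, timeout=30.0, opener=urllib.request.urlopen):
    """Issue one GET and return (elapsed_seconds, outcome).

    outcome is the HTTP status code on success, or the repr of the
    exception if the request failed or timed out.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            resp.read()  # drain the body so the timing covers the full response
            outcome = resp.status
    except Exception as exc:  # connection errors, timeouts, HTTP 4xx/5xx
        outcome = repr(exc)
    return time.monotonic() - start, outcome


def probe(url, attempts=5, pause=1.0, **kwargs):
    """Call time_request several times, sleeping `pause` seconds after each call."""
    results = []
    for _ in range(attempts):
        results.append(time_request(url, **kwargs))
        time.sleep(pause)
    return results
```

Calling `probe()` on the `/druid/coordinator/v1/metadata/segments` path of the coordinator (port 8081 in the error above), and then on the same path through the router, should show whether the 20s delay also happens outside the browser.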

How can that happen? Am I missing some parameters I should set?

Thanks a lot for your help


Answering my own question: I think I found the root cause.

My cluster was sharing its ZooKeeper with another cluster, and apparently it doesn't like that.

I created a new ZooKeeper cluster dedicated to my Druid cluster and everything works fine: no more errors!
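
For reference, the ZooKeeper connection is set in `common.runtime.properties`. The fragment below is only a sketch: the hostnames are placeholders, not the values from this cluster. `druid.zk.service.host` and `druid.zk.paths.base` are the standard Druid property names.

```
# common.runtime.properties (sketch; hostnames are placeholders)
# Point every Druid node at the dedicated ZooKeeper ensemble.
druid.zk.service.host=zk1.example.internal:2181,zk2.example.internal:2181,zk3.example.internal:2181
# Base znode path used by this Druid cluster (default is /druid).
druid.zk.paths.base=/druid
```

If two Druid clusters really have to share one ensemble, giving each its own `druid.zk.paths.base` should in principle keep their znodes apart, but that setup was not tested here.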