Druid 0.12.0: LookupCoordinatorManager with java.net.ConnectException: Connection refused

Hi,

From time to time we get an ERROR in LookupCoordinatorManager. It seems to me that the LookupCoordinatorManager tries to
connect to the same host and port twice within 2 ms, which causes the ConnectException (see the two "Generating: http://druid13.secret:8100" lines at 07:03:28,613 and 07:03:28,614 below).
2018-09-21T07:03:28,611 INFO [LookupCoordinatorManager–8] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid11.secret:8100
2018-09-21T07:03:28,611 INFO [LookupCoordinatorManager–3] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid12.secret:8083
2018-09-21T07:03:28,611 INFO [LookupCoordinatorManager–0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid12.secret:8100
2018-09-21T07:03:28,611 INFO [LookupCoordinatorManager–1] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid13.secret:8083
2018-09-21T07:03:28,611 INFO [LookupCoordinatorManager–5] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid15.secret:8083
2018-09-21T07:03:28,611 INFO [LookupCoordinatorManager–4] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid16.secret:8083
2018-09-21T07:03:28,611 INFO [LookupCoordinatorManager–9] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid16.secret:8100
2018-09-21T07:03:28,611 INFO [LookupCoordinatorManager–2] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid14.secret:8083
2018-09-21T07:03:28,611 INFO [LookupCoordinatorManager–7] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid11.secret:8083
2018-09-21T07:03:28,613 INFO [LookupCoordinatorManager–3] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid14.secret:8100
2018-09-21T07:03:28,613 INFO [LookupCoordinatorManager–7] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid13.secret:8100
2018-09-21T07:03:28,613 INFO [LookupCoordinatorManager–1] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid44.secret:8082
2018-09-21T07:03:28,614 INFO [LookupCoordinatorManager–7] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid13.secret:8100
2018-09-21T07:03:28,616 ERROR [LookupCoordinatorManager–7] io.druid.server.lookup.cache.LookupCoordinatorManager - Failed to finish lookup management on node [http:druid13.secret:8100]: {class=io.druid.server.lookup.cache.LookupCoordinatorManager, exceptionType=class java.util.concurrent.ExecutionException, exceptionMessage=org.jboss.netty.channel.ChannelException: Faulty channel in resource pool}
at io.druid.server.lookup.cache.LookupCoordinatorManager$LookupsCommunicator.getLookupStateForNode(LookupCoordinatorManager.java:805) ~[druid-server-0.12.0.jar:0.12.0]
at io.druid.server.lookup.cache.LookupCoordinatorManager.doLookupManagementOnNode(LookupCoordinatorManager.java:596) ~[druid-server-0.12.0.jar:0.12.0]
at io.druid.server.lookup.cache.LookupCoordinatorManager.lambda$lookupManagementLoop$2(LookupCoordinatorManager.java:540) ~[druid-server-0.12.0.jar:0.12.0]
at io.druid.server.lookup.cache.LookupCoordinatorManager$LookupsCommunicator.getLookupStateForNode(LookupCoordinatorManager.java:800) ~[druid-server-0.12.0.jar:0.12.0]
2018-09-21T07:03:28,617 INFO [LookupCoordinatorManager–7] io.druid.java.util.emitter.core.LoggingEmitter - Event [{"feed":"alerts","timestamp":"2018-09-21T07:03:28.617Z","service":"druid/coordinator","host":"druid44.secret:8081","version":"0.12.0","severity":"component-failure","description":"Failed to finish lookup management on node [http:druid13.secret:8100]","data":{"class":"io.druid.server.lookup.cache.LookupCoordinatorManager","exceptionType":"java.util.concurrent.ExecutionException","exceptionMessage":"org.jboss.netty.channel.ChannelException: Faulty channel in resource pool","exceptionStackTrace":"java.util.concurrent.ExecutionException: org.jboss.netty.channel.ChannelException: Faulty channel in resource pool\n\tat com.google.common.util.concurrent.Futures$ImmediateFailedFuture.get(Futures.java:186)\n\tat io.druid.server.lookup.cache.LookupCoordinatorManager$LookupsCommunicator.getLookupStateForNode(LookupCoordinatorManager.java:805)\n\tat io.druid.server.lookup.cache.LookupCoordinatorManager.doLookupManagementOnNode(LookupCoordinatorManager.java:596)\n\tat io.druid.server.lookup.cache.LookupCoordinatorManager.lambda$lookupManagementLoop$2(LookupCoordinatorManager.java:540)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: org.jboss.netty.channel.ChannelException: Faulty channel in resource pool\n\tat io.druid.java.util.http.client.NettyHttpClient.go(NettyHttpClient.java:147)\n\tat io.druid.server.lookup.cache.LookupCoordinatorManager$LookupsCommunicator.getLookupStateForNode(LookupCoordinatorManager.java:800)\n\t... 10 more\nCaused by: java.net.ConnectException: Connection refused: druid13.secret/10.secret:8100\n\tat sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)\n\tat sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)\n\tat org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)\n\tat org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)\n\tat org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)\n\tat org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)\n\tat org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)\n\tat org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)\n\tat org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)\n\t... 3 more\n"}}]
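
To check whether this pattern shows up elsewhere in the coordinator log, the "Generating:" lines can be scanned for the same URL appearing twice within a few milliseconds. The following is only a rough sketch: the function name find_rapid_repeats, the regular expression, and the 2 ms window are my own assumptions based on the log format above and are not part of Druid. A hit only shows that a new channel to the same target was requested twice in quick succession; it does not prove that this is what causes the "Connection refused".

```python
import re
from datetime import datetime, timedelta

# Matches ChannelResourceFactory "Generating: <url>" lines in the format shown above.
GENERATING = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3})\s+INFO\s+"
    r"\[(?P<thread>[^\]]+)\].*Generating:\s+(?P<url>\S+)"
)


def find_rapid_repeats(lines, window_ms=2):
    """Return (url, first_ts, second_ts) for each pair of consecutive
    attempts on the same URL that are at most window_ms apart."""
    window = timedelta(milliseconds=window_ms)
    last_seen = {}  # url -> timestamp of the most recent "Generating" line
    repeats = []
    for line in lines:
        match = GENERATING.match(line)
        if match is None:
            continue  # not a "Generating" line (e.g. ERROR or stack trace)
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S,%f")
        url = match.group("url")
        previous = last_seen.get(url)
        if previous is not None and ts - previous <= window:
            repeats.append((url, previous, ts))
        last_seen[url] = ts
    return repeats


# Three lines copied from the log excerpt above.
sample = [
    "2018-09-21T07:03:28,613 INFO [LookupCoordinatorManager–7] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid13.secret:8100",
    "2018-09-21T07:03:28,613 INFO [LookupCoordinatorManager–1] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid44.secret:8082",
    "2018-09-21T07:03:28,614 INFO [LookupCoordinatorManager–7] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid13.secret:8100",
]

for url, first, second in find_rapid_repeats(sample):
    gap_ms = (second - first) // timedelta(milliseconds=1)
    print(f"{url}: two attempts {gap_ms} ms apart")
```

To run it against a real log, pass an open file object instead of the sample list, since the function only iterates over lines.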

Is there something not configured properly or is it a bug?

Regards,

Alex

Hi Alex,

We are currently running into the same error from time to time.

Were you able to resolve the issue?

Greetings,

Stephan

Alex,
This link might be helpful – https://groups.google.com/forum/#!topic/druid-user/-beHFsHzOGw

Thanks,

–siva