Failure to query Druid - Faulty channel in resource pool

Hi all!

I have just experienced a failure when querying Druid. One of my brokers returned an HTTP 500 because it failed to get results from one historical node.
The exception is an org.jboss.netty.channel.ChannelException and says: Faulty channel in resource pool.

I’m wondering if you can help me understand the meaning of this exception? Is there a way to avoid this issue?

I have attached the full stack-trace if you guys want to take a look.

Thanks

Guillaume

broker-exception-stacktrace.txt (9.29 KB)

It looks like the root-cause exception is “org.jboss.netty.channel.ConnectTimeoutException: connection timed out: ec2-54-90-167-235.compute-1.amazonaws.com/10.5.165.53:8080”. In this case (a historical node timeout), if the error is transient, the best thing to do is to retry your query.

Thanks for your reply Gian. I will keep an eye on that and implement a retry mechanism if I get too many of these exceptions.
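For anyone reading this later, here is a minimal sketch of the kind of retry mechanism discussed above, in Python. It is only an illustration: the helper name `retry` and its parameters are made up for this example and are not part of Druid or any Druid client. It retries a callable a fixed number of times with exponential backoff and re-raises the last error if every attempt fails.

```python
import time


def retry(fn, attempts=3, base_delay=0.5, retry_on=(Exception,), sleep=time.sleep):
    """Call fn() until it succeeds or `attempts` tries have been made.

    Hypothetical helper for illustration only.
    Waits base_delay, 2*base_delay, 4*base_delay, ... between tries.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the caller see the last error.
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

You would wrap whatever function sends the query to the broker, for example `retry(lambda: run_query(q), retry_on=(IOError,))`, where `run_query` stands in for your own client code. Limiting `retry_on` to connection-type errors avoids retrying queries that fail for non-transient reasons.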

Hi, I have seen the same problem as you.

I have solved it. It seems that a wrong iptables configuration can cause this problem. Please check your iptables rules and make sure that the Druid nodes can communicate with each other.
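A quick way to check whether one node can reach another is to attempt a plain TCP connection to the target port. The sketch below is an illustration only; the function name `can_connect` is made up, and it tests only that the port is reachable, not that Druid itself is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration only. A refused connection,
    a timeout, or a DNS failure all count as "not reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from the broker host, `can_connect("10.5.165.53", 8080)` would test the historical node address from the stack trace above. If this returns False while the historical process is running, a firewall rule is a likely suspect.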

Here’s my iptables configuration. On Linux, the file is located at /etc/sysconfig/iptables.