RE: Failure getting results for query

A ChannelException is thrown frequently, although sometimes things work fine.

I have checked some relevant answers, but they did not solve my case:

https://groups.google.com/forum/#!msg/druid-user/-beHFsHzOGw/9gfHaNPLBAAJ

https://groups.google.com/forum/m/#!msg/druid-development/sI-urNJLzcs/TPtlAHo3DAAJ

The error stack is below:

2018-08-06T13:02:30,484 WARN [HttpClient-Netty-Boss-0] org.jboss.netty.channel.SimpleChannelUpstreamHandler - EXCEPTION, please implement org.jboss.netty.handler.codec.http.HttpContentDecompressor.exceptionCaught() for proper handling.

java.net.ConnectException: Connection refused: mdrd15st.prod.test.com/172.21.9.157:8103

at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_102]

at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_102]

at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152) ~[netty-3.10.6.Final.jar:?]

at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105) [netty-3.10.6.Final.jar:?]

at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79) [netty-3.10.6.Final.jar:?]

at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [netty-3.10.6.Final.jar:?]

at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) [netty-3.10.6.Final.jar:?]

at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.10.6.Final.jar:?]

at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [netty-3.10.6.Final.jar:?]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_102]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_102]

at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]

2018-08-06T13:02:30,485 WARN [DruidSchema-Cache-0] io.druid.sql.calcite.schema.DruidSchema - Metadata refresh failed, trying again soon.

io.druid.java.util.common.RE: Failure getting results for query[267b4ea1-c214-408c-93ac-2954ba747e02] url[http://mdrd15st.prod.test.com:8103/druid/v2/] because of [org.jboss.netty.channel.ChannelException: Faulty channel in resource pool]

at io.druid.client.DirectDruidClient$JsonParserIterator.init(DirectDruidClient.java:640) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.client.DirectDruidClient$JsonParserIterator.hasNext(DirectDruidClient.java:572) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.BaseSequence.makeYielder(BaseSequence.java:88) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.BaseSequence.toYielder(BaseSequence.java:68) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.MappedSequence.toYielder(MappedSequence.java:49) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.MergeSequence$2.accumulate(MergeSequence.java:70) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.MergeSequence$2.accumulate(MergeSequence.java:66) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.BaseSequence.accumulate(BaseSequence.java:46) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.MergeSequence.toYielder(MergeSequence.java:63) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.LazySequence.toYielder(LazySequence.java:46) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.query.RetryQueryRunner$1.toYielder(RetryQueryRunner.java:108) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.common.guava.CombiningSequence.toYielder(CombiningSequence.java:80) ~[druid-common-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.MappedSequence.toYielder(MappedSequence.java:49) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.MappedSequence.toYielder(MappedSequence.java:49) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:87) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:83) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.query.CPUTimeMetricQueryRunner$1.wrap(CPUTimeMetricQueryRunner.java:74) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.WrappingSequence.toYielder(WrappingSequence.java:82) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:87) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:83) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.WrappingSequence.toYielder(WrappingSequence.java:82) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.java.util.common.guava.Yielders.each(Yielders.java:32) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.sql.calcite.schema.DruidSchema.refreshSegmentsForDataSource(DruidSchema.java:405) ~[druid-sql-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.sql.calcite.schema.DruidSchema.refreshSegments(DruidSchema.java:372) ~[druid-sql-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.sql.calcite.schema.DruidSchema.access$1000(DruidSchema.java:79) ~[druid-sql-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.sql.calcite.schema.DruidSchema$2.run(DruidSchema.java:210) [druid-sql-0.10.1-iap3.jar:0.10.1-iap3]

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_102]

at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_102]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_102]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_102]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_102]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_102]

at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]

Caused by: java.util.concurrent.ExecutionException: org.jboss.netty.channel.ChannelException: Faulty channel in resource pool

at com.google.common.util.concurrent.Futures$ImmediateFailedFuture.get(Futures.java:186) ~[guava-16.0.1.jar:?]

at io.druid.client.DirectDruidClient$JsonParserIterator.init(DirectDruidClient.java:610) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]

… 33 more

Caused by: org.jboss.netty.channel.ChannelException: Faulty channel in resource pool

at com.metamx.http.client.NettyHttpClient.go(NettyHttpClient.java:140) ~[http-client-1.1.0.jar:?]

at io.druid.client.DirectDruidClient.run(DirectDruidClient.java:452) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.client.CachingClusteredClient$5.addSequencesFromServer(CachingClusteredClient.java:503) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.client.CachingClusteredClient$5.get(CachingClusteredClient.java:433) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]

at io.druid.client.CachingClusteredClient$5.get(CachingClusteredClient.java:427) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]

… 25 more

Caused by: java.net.ConnectException: Connection refused: mdrd15st.prod.test.com/172.21.9.157:8103

at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_102]

at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_102]

at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152) ~[netty-3.10.6.Final.jar:?]

at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105) ~[netty-3.10.6.Final.jar:?]

at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79) ~[netty-3.10.6.Final.jar:?]

at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) ~[netty-3.10.6.Final.jar:?]

at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) ~[netty-3.10.6.Final.jar:?]

at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[netty-3.10.6.Final.jar:?]

at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[netty-3.10.6.Final.jar:?]

… 3 more

I have the same issue, have you fixed that?

It looks like your worker mdrd15st.prod.test.com/172.21.9.157:8103 was not reachable. Are there any errors or warnings in the historical log on this node? Could there be a hardware or networking issue? Is mdrd15st.prod.test.com/172.21.9.157:8103 the only historical node being complained about in your query logs?
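To check reachability independently of Druid, a plain TCP connect to the same host and port from the machine that logged the error can help. The following is only a minimal sketch: the class name `PortProbe`, the method `canConnect`, and the 2-second timeout are made up for illustration, and the host and port are simply the ones from the stack trace above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    // Hypothetical helper: returns true if a TCP connection to host:port
    // can be opened within timeoutMillis, false on refusal, timeout,
    // or an unresolvable host name.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // ConnectException ("Connection refused"), SocketTimeoutException
            // and UnknownHostException are all IOExceptions.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are the host and port reported in the stack trace.
        String host = args.length > 0 ? args[0] : "mdrd15st.prod.test.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8103;
        System.out.println(host + ":" + port + " reachable=" + canConnect(host, port, 2000));
    }
}
```

If this reports `reachable=false` at the same moments the WARN lines appear, nothing was accepting connections on that port at that time, which points at the process on that node rather than at the query itself.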

Caused by: java.net.ConnectException: Connection refused: mdrd15st.prod.test.com/172.21.9.157:8103

8103 looks like a peon port. You could try increasing druid.server.http.numThreads on the peons via druid.indexer.runner.javaOpts, in case the peons do not have enough threads to service queries in a timely manner.
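As a rough illustration of that suggestion, the setting could be passed to the peons as a JVM system property in the middleManager's runtime.properties. The heap size and the thread count of 50 below are assumed placeholder values, not recommendations; keep whatever flags your existing javaOpts line already has and only append the `-D` option.

```properties
# middleManager runtime.properties (sketch; -Xmx2g and numThreads=50 are assumed values)
druid.indexer.runner.javaOpts=-server -Xmx2g -Ddruid.server.http.numThreads=50
```

The middleManager would need a restart, and only newly launched peons would pick up the change.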

If the connection failures consistently happen on one machine, that might indicate a problem with that specific machine.