HTTP PostEmitter - Timing out emitter batch send

Hi,

I frequently see the following error with the HTTPPostEmitter module. Appreciate any pointers to tune the configurations. Find below the configurations. Thanks.

Druid Version: 0.12.3

Emitter configuration:

druid.emitter=parametrized

druid.emitter.parametrized.httpEmitting.flushMillis=60000

druid.emitter.parametrized.httpEmitting.flushCount=100

druid.emitter.parametrized.httpEmitting.minHttpTimeoutMillis=100

Configs From the logs

HttpPostEmitter{config=HttpEmitterConfig{flushMillis=60000, flushCount=100, flushTimeOut=9223372036854775807, basicAuthentication=‘null’, batchingStrategy=ARRAY, maxBatchSize=3198156, contentEncoding=null, batchQueueSizeLimit=2, httpTimeoutAllowanceFactor=2.0, minHttpTimeoutMillis=0, recipientBaseUrl=‘http://10.10.10.10:7789/bfn-druid-metrics/metrics’}}]

ERROR seen:

2018-10-25T20:38:06,844 ERROR [HttpPostEmitter-1] io.druid.java.util.emitter.core.HttpPostEmitter - Timing out emitter batch send, last batch fill time [59,992] ms, timeout [119,984] ms

2018-10-25T20:38:06,846 WARN [HttpPostEmitter-1] io.druid.java.util.common.RetryUtils - Failed on try 1, retrying in 835ms.

java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Read timeout to /10.10.10.10:7789 after 60000 ms

at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_181]

at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) ~[?:1.8.0_181]

at org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:172) ~[async-http-client-2.0.37.jar:?]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.send(HttpPostEmitter.java:763) ~[java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.access$1900(HttpPostEmitter.java:467) ~[java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread$1.call(HttpPostEmitter.java:675) ~[java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread$1.call(HttpPostEmitter.java:671) ~[java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.common.RetryUtils.retry(RetryUtils.java:63) [java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.common.RetryUtils.retry(RetryUtils.java:81) [java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.sendWithRetries(HttpPostEmitter.java:669) [java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.emit(HttpPostEmitter.java:577) [java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.emitBatches(HttpPostEmitter.java:553) [java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.run(HttpPostEmitter.java:506) [java-util-0.12.3.jar:0.12.3]

Caused by: java.util.concurrent.TimeoutException: Read timeout to /10.12.34.5:7789 after 60000 ms

at org.asynchttpclient.netty.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:43) ~[async-http-client-2.0.37.jar:?]

at org.asynchttpclient.netty.timeout.ReadTimeoutTimerTask.run(ReadTimeoutTimerTask.java:54) ~[async-http-client-2.0.37.jar:?]

at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:663) ~[netty-all-4.0.52.Final.jar:4.0.52.Final]

at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:738) ~[netty-all-4.0.52.Final.jar:4.0.52.Final]

at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:466) ~[netty-all-4.0.52.Final.jar:4.0.52.Final]

at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]

2018-10-25T20:39:07,741 ERROR [HttpPostEmitter-1] io.druid.java.util.emitter.core.HttpPostEmitter - Timing out emitter batch send, last batch fill time [59,992] ms, timeout [119,984] ms

2018-10-25T20:39:07,743 ERROR [HttpPostEmitter-1] io.druid.java.util.emitter.core.HttpPostEmitter - Failed to send events to url[http://10.10.10.10:7789/bfn-druid-metrics/metrics]

java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Read timeout to /10.10.10.10:7789 after 60000 ms

at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_181]

at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) ~[?:1.8.0_181]

at org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:172) ~[async-http-client-2.0.37.jar:?]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.send(HttpPostEmitter.java:763) ~[java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.access$1900(HttpPostEmitter.java:467) ~[java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread$1.call(HttpPostEmitter.java:675) ~[java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread$1.call(HttpPostEmitter.java:671) ~[java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.common.RetryUtils.retry(RetryUtils.java:63) ~[java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.common.RetryUtils.retry(RetryUtils.java:81) ~[java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.sendWithRetries(HttpPostEmitter.java:669) [java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.emit(HttpPostEmitter.java:577) [java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.emitBatches(HttpPostEmitter.java:553) [java-util-0.12.3.jar:0.12.3]

at io.druid.java.util.emitter.core.HttpPostEmitter$EmittingThread.run(HttpPostEmitter.java:506) [java-util-0.12.3.jar:0.12.3]

Caused by: java.util.concurrent.TimeoutException: Read timeout to /10.10.10.10:7789 after 60000 ms

at org.asynchttpclient.netty.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:43) ~[async-http-client-2.0.37.jar:?]

at org.asynchttpclient.netty.timeout.ReadTimeoutTimerTask.run(ReadTimeoutTimerTask.java:54) ~[async-http-client-2.0.37.jar:?]

at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:663) ~[netty-all-4.0.52.Final.jar:4.0.52.Final]

at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:738) ~[netty-all-4.0.52.Final.jar:4.0.52.Final]

at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:466) ~[netty-all-4.0.52.Final.jar:4.0.52.Final]

at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]

Thank you,

Avinash

I have same problem.

在 2018年10月26日星期五 UTC+8上午4:57:55,ahe…@barefootnetworks.com写道:

Hey ,

I am also facing the same issue with druid historicals.

Please let me know if you could figure it out.

It seems that whatever you are emitting your metrics to isn’t able to process receiving them in 60 seconds. Which is a very long time, so you should check error logs and memory configuration of whatever is receving the metrics