Overlord fails when multiple tranquility end points are used

Group,

I have the following use case: streaming data from 2 data sources (1st source: 12.2k records per second; 2nd source: 8.6k records per second) and writing each to a different Tranquility end point.

My cluster is:

3 ZKs

1 Overlord + Coord (running on the same node)

3 Historical + Middle Manager

1 Broker Node

I find that the Overlord breaks with “Netty Channel” or “No Hosts Found” errors after running for about 6 hours.

Please let me know if you require further input. I would like suggestions on the cluster configs. I have read the docs, but I am not sure the documentation is accurate for complex real-time ingestion like ours (our first stream has a schema with 32 metrics and 86 dimensions).

Karthik

Hey Karthik,

Posting the overlord logs would be helpful. Posting the logs of whatever the overlord is attempting to talk to when it throws that exception would also be helpful (i.e. if the exceptions are happening when the overlord is talking to an indexing task, the task log might be interesting).

Hey David,

I had the same error today. I am pasting the error logs below.

2016-11-07T17:00:28,124 ERROR [qtp1852901506-75] com.sun.jersey.spi.container.ContainerResponse - The RuntimeException could not be mapped to a response, re-throwing to the HTTP container

com.metamx.common.ISE: Unable to grant lock to inactive Task [index_realtime_watchtower_2016-11-06T21:00:00.000Z_0_0]

    at io.druid.indexing.overlord.TaskLockbox.tryLock(TaskLockbox.java:229) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.TaskLockbox.tryLock(TaskLockbox.java:206) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.TaskLockbox.lock(TaskLockbox.java:184) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.common.actions.LockAcquireAction.perform(LockAcquireAction.java:61) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.common.actions.LockAcquireAction.perform(LockAcquireAction.java:31) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.common.actions.LocalTaskActionClient.submit(LocalTaskActionClient.java:64) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.http.OverlordResource$3.apply(OverlordResource.java:326) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.http.OverlordResource$3.apply(OverlordResource.java:315) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.http.OverlordResource.asLeaderWith(OverlordResource.java:658) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.http.OverlordResource.doAction(OverlordResource.java:312) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source) ~[?:?]

    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_111]

    at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_111]

    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) [jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) [jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) [jersey-server-1.19.jar:1.19]

    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) [jersey-servlet-1.19.jar:1.19]

    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) [jersey-servlet-1.19.jar:1.19]

    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) [jersey-servlet-1.19.jar:1.19]

    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]

    at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:278) [guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:268) [guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:180) [guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:93) [guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) [guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120) [guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:132) [guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:129) [guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:206) [guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:129) [guice-servlet-4.0-beta.jar:?]

    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]

    at io.druid.server.http.RedirectFilter.doFilter(RedirectFilter.java:71) [druid-server-0.9.1.1.jar:0.9.1.1]

    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83) [jetty-servlets-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364) [jetty-servlets-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) [jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) [jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.Server.handle(Server.java:497) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540) [jetty-io-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:620) [jetty-util-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:540) [jetty-util-9.2.5.v20141112.jar:9.2.5.v20141112]

    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_111]

2016-11-07T17:00:28,126 WARN [qtp1852901506-75] org.eclipse.jetty.servlet.ServletHandler - /druid/indexer/v1/action

com.metamx.common.ISE: Unable to grant lock to inactive Task [index_realtime_watchtower_2016-11-06T21:00:00.000Z_0_0]

    at io.druid.indexing.overlord.TaskLockbox.tryLock(TaskLockbox.java:229) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.TaskLockbox.tryLock(TaskLockbox.java:206) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.TaskLockbox.lock(TaskLockbox.java:184) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.common.actions.LockAcquireAction.perform(LockAcquireAction.java:61) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.common.actions.LockAcquireAction.perform(LockAcquireAction.java:31) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.common.actions.LocalTaskActionClient.submit(LocalTaskActionClient.java:64) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.http.OverlordResource$3.apply(OverlordResource.java:326) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.http.OverlordResource$3.apply(OverlordResource.java:315) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.http.OverlordResource.asLeaderWith(OverlordResource.java:658) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at io.druid.indexing.overlord.http.OverlordResource.doAction(OverlordResource.java:312) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]

    at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source) ~[?:?]

    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_111]

    at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_111]

    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) ~[jersey-server-1.19.jar:1.19]

    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) ~[jersey-servlet-1.19.jar:1.19]

    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) ~[jersey-servlet-1.19.jar:1.19]

    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) ~[jersey-servlet-1.19.jar:1.19]

    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) ~[javax.servlet-api-3.1.0.jar:3.1.0]

    at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:278) ~[guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:268) ~[guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:180) ~[guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:93) ~[guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120) ~[guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:132) ~[guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:129) ~[guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:206) ~[guice-servlet-4.0-beta.jar:?]

    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:129) ~[guice-servlet-4.0-beta.jar:?]

    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) ~[jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]

    at io.druid.server.http.RedirectFilter.doFilter(RedirectFilter.java:71) ~[druid-server-0.9.1.1.jar:0.9.1.1]

    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) ~[jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83) ~[jetty-servlets-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364) ~[jetty-servlets-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) ~[jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) [jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) [jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.Server.handle(Server.java:497) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540) [jetty-io-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:620) [jetty-util-9.2.5.v20141112.jar:9.2.5.v20141112]

    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:540) [jetty-util-9.2.5.v20141112.jar:9.2.5.v20141112]

    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_111]

The corresponding Tranquility error, from around the same time:

com.twitter.finagle.NoBrokersAvailableException: No hosts are available for druidTask!druid:overlord!index_realtime_watchtower_2016-11-07T17:00:00.000Z_0_0, Dtab.base=, Dtab.local=

    at com.twitter.finagle.NoStacktrace(Unknown Source) ~[na:na]

2016-11-07 17:39:36,437 [Hashed wheel timer #1] INFO c.metamx.emitter.core.LoggingEmitter - Event [{"feed":"alerts","timestamp":"2016-11-07T17:39:36.437Z","service":"tranquility","host":"localhost","severity":"anomaly","description":"Failed to propagate events: druid:overlord/watchtower","data":{"exceptionType":"com.twitter.finagle.NoBrokersAvailableException","exceptionStackTrace":"com.twitter.finagle.NoBrokersAvailableException: No hosts are available for druidTask!druid:overlord!index_realtime_watchtower_2016-11-07T17:00:00.000Z_0_0, Dtab.base=, Dtab.local=\n\tat com.twitter.finagle.NoStacktrace(Unknown Source)\n","timestamp":"2016-11-07T17:00:00.000Z","beams":"MergingPartitioningBeam(DruidBeam(interval = 2016-11-07T17:00:00.000Z/2016-11-07T18:00:00.000Z, partition = 0, tasks = [index_realtime_watchtower_2016-11-07T17:00:00.000Z_0_0/watchtower-017-0000-0000]))","eventCount":1,"exceptionMessage":"No hosts are available for druidTask!druid:overlord!index_realtime_watchtower_2016-11-07T17:00:00.000Z_0_0, Dtab.base=, Dtab.local="}}]

2016-11-07 17:41:09,466 [Hashed wheel timer #1] WARN c.m.tranquility.beam.ClusteredBeam - Emitting alert: [anomaly] Failed to propagate events: druid:overlord/watchtower

{

"eventCount" : 1,

"timestamp" : "2016-11-07T17:00:00.000Z",

"beams" : "MergingPartitioningBeam(DruidBeam(interval = 2016-11-07T17:00:00.000Z/2016-11-07T18:00:00.000Z, partition = 0, tasks = [index_realtime_watchtower_2016-11-07T17:00:00.000Z_0_0/watchtower-017-0000-0000]))"

}

Please let me know if you require further information. Thank you for responding back!

Karthik

Hey Karthik,

Both of those logs indicate an indexing task which has failed (which caused the lock grant to fail and Tranquility to fail). The root of the issue (why the indexing task is failing) should be in the indexing task log. Can you take a look at that?
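
To know which task logs to open, it can help to pull the task IDs out of the overlord and Tranquility messages. The following is only a rough sketch: the regular expression and the function name task_ids are illustrative assumptions, not part of Druid or Tranquility, and the pattern assumes task IDs start with "index_realtime_" as they do in the logs pasted in this thread.

```python
import re

# Assumption: task IDs start with "index_realtime_" and end at the first
# character that cannot belong to the ID (whitespace, ']', ',' or '/').
TASK_ID = re.compile(r"index_realtime_[^\s\],/]+")


def task_ids(log_text):
    """Return the distinct task IDs mentioned in log_text, in first-seen order."""
    seen = []
    for match in TASK_ID.findall(log_text):
        if match not in seen:
            seen.append(match)
    return seen


# Sample lines copied from the logs in this thread.
overlord_line = (
    "com.metamx.common.ISE: Unable to grant lock to inactive Task "
    "[index_realtime_watchtower_2016-11-06T21:00:00.000Z_0_0]"
)
tranquility_line = (
    "No hosts are available for druidTask!druid:overlord!"
    "index_realtime_watchtower_2016-11-07T17:00:00.000Z_0_0, "
    "Dtab.base=, Dtab.local="
)

if __name__ == "__main__":
    for task in task_ids(overlord_line + "\n" + tranquility_line):
        print(task)
```

Run against the two excerpts above, this lists two different tasks: the overlord message names the 2016-11-06T21:00 task, while the Tranquility message names the 2016-11-07T17:00 task, so both task logs may be worth checking.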

David,

I had the issue again. I thought the overlord log was the indexing log. Is there another log file I should look at? My understanding was that the Overlord and Coordinator provide the indexing logs.

Karthik

Hey Karthik,

The task log is separate from the overlord/coordinator logs. You should be able to see the task logs by going to the overlord console (http://{OVERLORD_IP}:8090 by default) and clicking on log(all) next to the task. If no logs were saved, check your druid.indexer.logs.type configuration as described here: http://druid.io/docs/0.9.1.1/configuration/indexing-service.html
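
If clicking through the console is inconvenient, the same task log can usually be fetched over HTTP from the overlord. The sketch below assumes the overlord's task log endpoint (/druid/indexer/v1/task/{taskId}/log) and the default port 8090 mentioned above; the host name overlord.example.com and the helper names are placeholders invented for illustration.

```python
from urllib.parse import quote
from urllib.request import urlopen


def task_log_url(overlord_host, task_id, port=8090):
    """Build the overlord URL for a task's log (the ID is percent-encoded)."""
    return (
        f"http://{overlord_host}:{port}"
        f"/druid/indexer/v1/task/{quote(task_id, safe='')}/log"
    )


def fetch_task_log(overlord_host, task_id, port=8090, timeout=30):
    """Download the task log as text; raises an error if the request fails."""
    with urlopen(task_log_url(overlord_host, task_id, port), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder host; the task ID is the one from the overlord error above.
    print(task_log_url(
        "overlord.example.com",
        "index_realtime_watchtower_2016-11-06T21:00:00.000Z_0_0",
    ))
```

If the request returns nothing for a finished task, that points back to the druid.indexer.logs.type setting, since the overlord can only serve logs that were actually saved.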