Druid 0.9.0 overlord throws an exception when Tranquility communicates with it

Hi,

I ran into a strange problem. When I start Tranquility to consume messages from Kafka, it shows an error:

2016-04-29 08:46:30,813 [finagle/netty3-1] WARN c.m.tranquility.finagle.FutureRetry$ - Transient error, will try again in 13,860 ms

com.metamx.tranquility.druid.IndexServiceTransientException: Service[druid:overlord] call failed with status: 500 Internal Server Error

at com.metamx.tranquility.druid.IndexService$$anonfun$call$1$$anonfun$apply$17.apply(IndexService.scala:150) ~[io.druid.tranquility-core-0.7.4.jar:0.7.4]

at com.metamx.tranquility.druid.IndexService$$anonfun$call$1$$anonfun$apply$17.apply(IndexService.scala:132) ~[io.druid.tranquility-core-0.7.4.jar:0.7.4]

at com.twitter.util.Future$$anonfun$map$1$$anonfun$apply$6.apply(Future.scala:950) ~[com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Try$.apply(Try.scala:13) ~[com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Future$.apply(Future.scala:97) ~[com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

Then I checked the overlord node's log, which shows the following. I guess the root cause is that the overlord node can't find the class RealtimeTuningConfig:

2016-04-29T07:35:52,487 WARN [qtp1507315859-36] org.eclipse.jetty.servlet.ServletHandler - Error for /druid/indexer/v1/task

java.lang.NoClassDefFoundError: Could not initialize class io.druid.segment.indexing.RealtimeTuningConfig

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.7.0_79]

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) ~[?:1.7.0_79]

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.7.0_79]

at java.lang.reflect.Constructor.newInstance(Constructor.java:526) ~[?:1.7.0_79]

But when I start the overlord node, I do add the library directory to the classpath:

nohup java `cat conf/druid/overlord/jvm.config | xargs` -classpath conf/druid/_common:conf/druid/overlord:lib/* io.druid.cli.Main server overlord 1>>/tmp/druid-logs/overlord.log 2>&1 &

The overlord log also confirms the classpath:

2016-04-29T07:32:22,716 INFO [main] org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=conf/druid/_common:conf/druid/overlord:lib/log4j-1.2-api-2.5.jar:lib/druid-server-0.9.0.jar

When I start a realtime node, it can ingest the sample data from Kafka successfully.

When I ingest batch data from HDFS, the overlord runs the task successfully.

Only when I try to start Tranquility do I see this exception in the overlord.

Could you give me some suggestions?

The Tranquility configuration file is kafka.json.

The common configuration file is common.runtime.properties.

The overlord configuration file is runtime.properties.

Tranquility's log is tranquility.log.

The overlord's log is overlord.log.

Thanks

overlord.log (1.97 MB)

tranquility.log (418 KB)

kafka.json (2.03 KB)

common.runtime.properties (3.86 KB)

runtime.properties (153 Bytes)

Hi
I turned on a debug flag when the JVM starts the overlord, and I can see that RealtimeTuningConfig is loaded, but it cannot be initialized. Why is that?
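
For reference, here is a minimal, self-contained sketch (made-up class names, not Druid code) of how a class can be loaded yet still fail with "Could not initialize class": the static initializer throws on first use, and every later use of the class reports NoClassDefFoundError.

```java
// Minimal sketch with made-up class names (not Druid code) showing the JVM
// behavior behind "Could not initialize class": the class is on the classpath
// and loads fine, but its static initializer throws on first use, and every
// later use then fails with NoClassDefFoundError.
public class StaticInitDemo {
    static class Fragile {
        // Stand-in for something like creating a temp directory that fails.
        static final String VALUE = failingInit();

        static String failingInit() {
            throw new IllegalStateException("init failed");
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println(Fragile.VALUE);      // first use triggers static init
        } catch (Throwable t) {
            System.out.println("first use:  " + t); // ExceptionInInitializerError
        }
        try {
            System.out.println(Fragile.VALUE);      // class is now marked as failed
        } catch (Throwable t) {
            System.out.println("second use: " + t); // NoClassDefFoundError: Could not initialize class ...
        }
    }
}
```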

Hey Gary,

This error is really cryptic but I think the most common cause is that your java.io.tmpdir does not exist or is not writeable. Could you double-check that?
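
One quick way to double-check what the overlord's JVM actually sees is a small standalone check like the sketch below (not Druid code); run it with the same -Djava.io.tmpdir value that is presumably set in conf/druid/overlord/jvm.config.

```java
// Minimal standalone check (not Druid code): print the JVM's java.io.tmpdir
// and whether it exists and is writable. Run it with the same -Djava.io.tmpdir
// setting as the overlord, e.g. java -Djava.io.tmpdir=var/tmp TmpDirCheck.
import java.io.File;

public class TmpDirCheck {
    public static void main(String[] args) {
        File tmpDir = new File(System.getProperty("java.io.tmpdir"));
        System.out.println("java.io.tmpdir = " + tmpDir.getAbsolutePath());
        System.out.println("exists         = " + tmpDir.exists());
        System.out.println("writable       = " + tmpDir.canWrite());
    }
}
```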

Hi Gian,
Thank you for the quick reply. I checked java.io.tmpdir: it is var/tmp, which didn't exist. I created it manually and the error disappeared. If I delete the directory, the overlord prints the error again. So it works now.

You helped me a lot, Gian. :)

Gary

Hey guys,

I have started up a Tranquility server. With windowPeriod set to 20m I get no error, just this response, {"result":{"received":7681,"sent":0}}, and nothing reaches Druid. When I set windowPeriod to 2d I keep getting the same error:

com.metamx.tranquility.druid.IndexServiceTransientException: Service[druid:overlord] call failed with status: 500 Internal Server Error

at com.metamx.tranquility.druid.IndexService$$anonfun$call$1$$anonfun$apply$17.apply(IndexService.scala:150) ~[io.druid.tranquility-core-0.7.4.jar:0.7.4]

at com.metamx.tranquility.druid.IndexService$$anonfun$call$1$$anonfun$apply$17.apply(IndexService.scala:132) ~[io.druid.tranquility-core-0.7.4.jar:0.7.4]

at com.twitter.util.Future$$anonfun$map$1$$anonfun$apply$6.apply(Future.scala:950) ~[com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Try$.apply(Try.scala:13) ~[com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Future$.apply(Future.scala:97) ~[com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Future$$anonfun$map$1.apply(Future.scala:950) ~[com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Future$$anonfun$map$1.apply(Future.scala:949) ~[com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Promise$Transformer.liftedTree1$1(Promise.scala:112) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Promise$Transformer.k(Promise.scala:112) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Promise$Transformer.apply(Promise.scala:122) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Promise$Transformer.apply(Promise.scala:103) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Promise$$anon$1.run(Promise.scala:366) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.concurrent.LocalScheduler$Activation.run(Scheduler.scala:178) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.concurrent.LocalScheduler$Activation.submit(Scheduler.scala:136) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.concurrent.LocalScheduler.submit(Scheduler.scala:207) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.concurrent.Scheduler$.submit(Scheduler.scala:92) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Promise.runq(Promise.scala:350) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Promise.updateIfEmpty(Promise.scala:721) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Promise.update(Promise.scala:694) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.util.Promise.setValue(Promise.scala:670) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.concurrent.AsyncQueue.offer(AsyncQueue.scala:111) [com.twitter.util-core_2.11-6.30.0.jar:6.30.0]

at com.twitter.finagle.netty3.transport.ChannelTransport.handleUpstream(ChannelTransport.scala:55) [com.twitter.finagle-core_2.11-6.31.0.jar:6.31.0]

at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.handler.codec.http.HttpContentDecoder.messageReceived(HttpContentDecoder.java:108) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.handler.codec.http.HttpClientCodec.handleUpstream(HttpClientCodec.java:92) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.SimpleChannelHandler.messageReceived(SimpleChannelHandler.java:142) [io.netty.netty-3.10.5.Final.jar:na]

at com.twitter.finagle.netty3.channel.ChannelStatsHandler.messageReceived(ChannelStatsHandler.scala:78) [com.twitter.finagle-core_2.11-6.31.0.jar:6.31.0]

at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.SimpleChannelHandler.messageReceived(SimpleChannelHandler.java:142) [io.netty.netty-3.10.5.Final.jar:na]

at com.twitter.finagle.netty3.channel.ChannelRequestStatsHandler.messageReceived(ChannelRequestStatsHandler.scala:35) [com.twitter.finagle-core_2.11-6.31.0.jar:6.31.0]

at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [io.netty.netty-3.10.5.Final.jar:na]

at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [io.netty.netty-3.10.5.Final.jar:na]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_73]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_73]

at java.lang.Thread.run(Thread.java:745) [na:1.8.0_73]

I want to accept realtime events but some have older timestamps. Can you please help?

George

Do you see any exceptions in your overlord logs? One guess is that you might have to specify windowPeriod as minutes or hours (like PT48H instead of P2D).

That being said, windowPeriods this long will work, but they are not really typical in Druid clusters because usually people want data to be handed off to historicals more often. segmentGranularity should be DAY with this setup. See https://github.com/druid-io/tranquility/blob/master/docs/overview.md#segment-granularity-and-window-period for some discussion about how these configs work together.
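
Roughly speaking, an event can only be sent while the segment bucket its timestamp falls into is still open, i.e. the bucket's end plus windowPeriod has not yet passed; that is why old-timestamped events show up as "received" but never as "sent". The sketch below is a simplified illustration of that rule, not Tranquility's actual code, and assumes hourly segmentGranularity.

```java
// Simplified sketch (not Tranquility's actual code) of the windowPeriod rule:
// an event can only be sent while the segment bucket its timestamp falls into
// is still open, i.e. the bucket's end plus windowPeriod is still in the future.
// Uses Joda-Time, the library Druid and Tranquility use for periods.
import org.joda.time.DateTime;
import org.joda.time.Period;

public class WindowCheck {
    // Assumes hourly segmentGranularity purely for illustration.
    static boolean bucketStillOpen(DateTime eventTime, DateTime now, Period windowPeriod) {
        DateTime bucketEnd = eventTime.hourOfDay().roundFloorCopy().plusHours(1);
        return bucketEnd.plus(windowPeriod).isAfter(now);
    }

    public static void main(String[] args) {
        DateTime now = DateTime.now();
        Period window = Period.parse("PT20M");
        System.out.println(bucketStillOpen(now, now, window));               // true: current event is accepted
        System.out.println(bucketStillOpen(now.minusHours(3), now, window)); // false: bucket closed, event dropped
    }
}
```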

Since I saw that minutes were used, I tried that out as well, setting it to P3000M. Same result again: the Tranquility server started up without errors, and I got the same error when trying to feed data.

I don't see any exceptions in overlord.log.

Thank you,

George

There needs to be a T in there; PT3000M rather than P3000M (the latter is interpreted as months). If that doesn’t work either, perhaps try setting your log level to trace or debug so the full request/response pair is logged. That might have some further clues.
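
A quick way to see the difference, using Joda-Time (the library Druid and Tranquility use for ISO-8601 periods): fields before the T are date-based, fields after it are time-based.

```java
// Quick Joda-Time check of how ISO-8601 period strings are interpreted:
// "M" before the T means months, "M" after the T means minutes.
import org.joda.time.Period;

public class PeriodUnits {
    public static void main(String[] args) {
        System.out.println(Period.parse("P3000M").getMonths());   // 3000 (months)
        System.out.println(Period.parse("PT3000M").getMinutes()); // 3000 (minutes)
        System.out.println(Period.parse("PT48H").getHours());     // 48 (hours)
        System.out.println(Period.parse("P2D").getDays());        // 2 (days)
    }
}
```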

Hi, I also ran into the same problem. Your database must be UTF-8. Check your overlord log when you start the overlord node; you will see the error message about the database.

On Friday, April 29, 2016 at 5:59:16 PM UTC+8, Gary Wu wrote: