Druid cluster with Tranquility

Hi Team,

I'm trying to set up a Druid cluster with Tranquility. I'm able to send data to Druid, but even though tasks are created successfully, Tranquility logs the error below, and sometimes I'm unable to query real-time data from the broker node, which returns the second error below.

While the Overlord is running, it also intermittently shows a LOOKUP ERROR saying no data was found.

Tranquility error:

```
c.m.tranquility.beam.ClusteredBeam - Emitting alert: [anomaly] Failed to propagate events: druid:overlord/spyagent
{
  "eventCount" : 1,
  "timestamp" : "2018-04-30T06:30:00.000Z",
  "beams" : "MergingPartitioningBeam(DruidBeam(interval = 2018-04-30T06:30:00.000Z/2018-04-30T06:45:00.000Z, partition = 0, tasks = [index_realtime_spyagent_2018-04-30T06:30:00.000Z_0_0/spyagent-030-0000-0000]))"
}
com.twitter.finagle.NoBrokersAvailableException: No hosts are available for disco!firehose:druid:overlord:spyagent-030-0000-0000, Dtab.base=, Dtab.local=
	at com.twitter.finagle.NoStacktrace(Unknown Source) ~[na:na]
```

**Broker error for realtime query:**

```json
{
  "error": "Unknown exception",
  "errorMessage": "Failure getting results for query[39422a60-5d31-4e11-8a3a-4f046237a0d1] url[http://192.168.207.14:8101/druid/v2/] because of [org.jboss.netty.channel.ChannelException: Faulty channel in resource pool]",
  "errorClass": "io.druid.java.util.common.RE",
  "host": null
}
```

Druid cluster:

Node 1 (4 CPU, 8 GB RAM, 90 GB disk):
- ZooKeeper 3.4.10
- Tranquility 0.8.0
- historical
- middleManager

Node 2 (4 CPU, 8 GB RAM, 90 GB disk):
- overlord
- coordinator
- broker

Tranquility Server config:

```json
{
  "dataSources" : {
    "spyagent" : {
      "spec" : {
        "dataSchema" : {
          "dataSource" : "spyagent",
          "parser" : {
            "type" : "string",
            "parseSpec" : {
              "timestampSpec" : {
                "column" : "time",
                "format" : "auto"
              },
              "dimensionsSpec" : {
                "dimensions" : [],
                "dimensionExclusions" : [
                  "time",
                  "value"
                ]
              },
              "format" : "json"
            }
          },
          "granularitySpec" : {
            "type" : "uniform",
            "segmentGranularity" : "fifteen_minute",
            "queryGranularity" : "none"
          },
          "metricsSpec" : [
            {
              "type" : "count",
              "name" : "count"
            }
          ]
        },
        "ioConfig" : {
          "type" : "realtime"
        },
        "tuningConfig" : {
          "type" : "realtime",
          "maxRowsInMemory" : "1000000",
          "intermediatePersistPeriod" : "PT1M",
          "windowPeriod" : "PT1M"
        }
      },
      "properties" : {
        "task.partitions" : "1",
        "task.replicants" : "1"
      }
    }
  },
  "properties" : {
    "zookeeper.connect" : "192.168.207.14",
    "druid.discovery.curator.path" : "/druid/discovery",
    "druid.selectors.indexing.serviceName" : "druid/overlord",
    "http.port" : "8200",
    "http.threads" : "9"
  }
}
```
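For reference, this is roughly how events reach Tranquility Server (a simplified sketch, assuming the standard /v1/post/{dataSource} HTTP endpoint; the port 8200 and dataSource name come from the config above, and the payload fields are illustrative):

```shell
# Push one test event into Tranquility Server's HTTP API.
# Port 8200 and dataSource "spyagent" are taken from the config above.
# The timestamp must fall within windowPeriod (PT1M) of now, or the
# event is silently dropped and never reaches an indexing task.
NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)
if curl -s -m 5 -X POST \
     -H 'Content-Type: application/json' \
     -d "{\"time\": \"$NOW\", \"value\": 1}" \
     "http://192.168.207.14:8200/v1/post/spyagent"; then
  echo "event submitted"
else
  echo "tranquility server not reachable from this host"
fi
```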

Historical node runtime properties:

```
druid.host=192.168.207.14
druid.service=druid/historical
druid.port=8083
druid.server.http.numThreads=9
druid.processing.buffer.sizeBytes=256000000
druid.processing.numThreads=2
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":26843545600}]
druid.server.maxSize=26843545600
```

Historical jvm.config:

```
-Xms1g
-Xmx1g
-XX:MaxDirectMemorySize=1280m
```

MiddleManager node runtime properties:

```
druid.host=192.168.207.14
druid.service=druid/middleManager
druid.port=8091
druid.worker.capacity=25
druid.indexer.runner.javaOpts=-server -Xmx2g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
druid.indexer.task.baseTaskDir=var/druid/task
druid.server.http.numThreads=9
druid.indexer.fork.property.druid.processing.buffer.sizeBytes=256000000
druid.indexer.fork.property.druid.processing.numThreads=2
druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp
druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.7.3"]
```

MiddleManager jvm.config:

```
-server
-Xms64m
-Xmx64m
-Duser.timezone=UTC
```

Overlord node runtime properties:

```
druid.host=192.168.207.16
druid.service=druid/overlord
druid.port=8090
druid.indexer.queue.startDelay=PT5S
druid.indexer.runner.type=remote
druid.indexer.storage.type=metadata
```

Overlord jvm.config:

```
-Xms256m
-Xmx256m
-Duser.timezone=UTC
```

Coordinator node runtime properties:

```
druid.host=192.168.207.16
druid.service=druid/coordinator
druid.port=8081
druid.indexer.queue.startDelay=PT10S
druid.coordinator.period=PT5S
```

Coordinator jvm.config:

```
-Xms256m
-Xmx256m
-Duser.timezone=UTC
```

Broker node runtime properties:

```
druid.host=192.168.207.16
druid.service=druid/coordinator
druid.port=8081
druid.broker.http.numConnections=5
druid.server.http.numThreads=9
druid.processing.buffer.sizeBytes=256000000
druid.processing.numThreads=2
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
druid.cache.type=local
druid.cache.sizeInBytes=10000000
```

Broker jvm.config:

```
-Xms1g
-Xmx1g
-XX:MaxDirectMemorySize=1792m
```

Common runtime properties:

```
druid.extensions.loadList=
```

In my experience, your error:

```json
{
  "error": "Unknown exception",
  "errorMessage": "Failure getting results for query[39422a60-5d31-4e11-8a3a-4f046237a0d1] url[http://192.168.207.14:8101/druid/v2/] because of [org.jboss.netty.channel.ChannelException: Faulty channel in resource pool]",
  "errorClass": "io.druid.java.util.common.RE",
  "host": null
}
```

means that the referenced URL (http://192.168.207.14:8101/druid/v2/; judging by the port number, I would assume this is your coordinator node) is not reachable from some other node.

When it happened to me, it was actually my historicals that were not reachable (I think by my brokers). Fixing my networking so that the URL was resolvable from my brokers fixed this for me.

In your case, I would suggest SSHing into your broker/historical nodes and trying to curl http://192.168.207.14:8101/druid/v2/ to see whether your historical/broker nodes can talk to your coordinator.
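For example, something like this from each of those hosts (the URL is the one from your error message; adjust it to your layout):

```shell
# Run from the broker host, then from the historical host: can we reach
# the endpoint named in the error? -m 5 caps each attempt at 5 seconds,
# so a filtered port fails fast instead of hanging.
URL="http://192.168.207.14:8101/druid/v2/"
if curl -s -m 5 -o /dev/null "$URL"; then
  echo "$URL reachable"
else
  echo "$URL unreachable (refused, filtered, or timed out)"
fi
```

A connection refusal or timeout here reproduces the "Faulty channel in resource pool" symptom at the network level.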

Hi Matt,

Thank you for your response. Yes, I tried that too: I opened port 8101 on the historical node as well as the broker node, but I'm still facing the same problem. As you suggested, I ran curl http://192.168.207.14:8101/druid/v2/ from the historical/broker nodes, and the historical/broker are not producing any error logs or response. Is there any other solution…? If you don't mind, could you suggest runtime properties for all services for my hardware?

Thanks and regards,

Sai

Hey there,

I've seen Tranquility give these kinds of errors when either there's no capacity to create an indexing task, or Tranquility has already created the task for the current interval but the task has failed.

Can you check the Overlord's console and ensure the realtime tasks there are running? If not, their logs might give some insight into what is going wrong.
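If it's easier than the console, the same task state is available from the Overlord's HTTP API (a sketch; host and port are taken from your overlord config, and TASK_ID is a placeholder for the task id from the Tranquility alert):

```shell
# Overlord host/port from the config posted above.
OVERLORD="http://192.168.207.16:8090"

# Realtime tasks currently running; the task named in the Tranquility
# alert (index_realtime_spyagent_...) should show up here.
curl -s -m 5 "$OVERLORD/druid/indexer/v1/runningTasks"

# Recently finished tasks, including failures:
curl -s -m 5 "$OVERLORD/druid/indexer/v1/completeTasks"

# Full log of one task (substitute the exact task id from the alert):
curl -s -m 5 "$OVERLORD/druid/indexer/v1/task/TASK_ID/log"
```

A task that appears in completeTasks as FAILED shortly after creation usually explains the NoBrokersAvailableException, since Tranquility then has no live task to send to.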

Best regards,

Dylan