realtime node fails during startup

Can anyone help me solve this problem?

Hi Ram, have you set the property "druid.realtime.specFile"?

Hi, Ram. I have encountered this problem before, and I suggest you check the shell script you use to start the realtime node.

Hi Gian,
Thanks for the pointer.

It works with the spec file. We are currently using Tranquility for realtime ingestion. Do you see a need for a realtime node while using Tranquility?

a) If realtime nodes are not required, then how are realtime queries and segment hand-offs handled?

b) If they are required, what would my spec file for realtime nodes look like? (Any examples would greatly help.)

c) Do you recommend a better architecture for this use case? [We send events from multiple sources into a Kafka queue, then via Storm + Tranquility --> overlord via ZK.]

Please see the basic bolt config for Tranquility below.

---------------------- Tranquility bolt --------------

// Imports below assume the tranquility-storm module and its contemporary Druid/Curator/Joda APIs.
import java.util.List;
import java.util.Map;

import backtype.storm.task.IMetricsContext;

import com.google.common.collect.ImmutableList;
import com.metamx.common.Granularity;
import com.metamx.tranquility.beam.Beam;
import com.metamx.tranquility.beam.ClusteredBeamTuning;
import com.metamx.tranquility.druid.DruidBeams;
import com.metamx.tranquility.druid.DruidLocation;
import com.metamx.tranquility.druid.DruidRollup;
import com.metamx.tranquility.storm.BeamFactory;
import com.metamx.tranquility.typeclass.Timestamper;

import io.druid.granularity.QueryGranularity;
import io.druid.query.aggregation.AggregatorFactory;
import io.druid.query.aggregation.CountAggregatorFactory;
import io.druid.query.aggregation.DoubleSumAggregatorFactory;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.BoundedExponentialBackoffRetry;
import org.joda.time.DateTime;
import org.joda.time.Period;

public class MyBeamFactory implements BeamFactory<String> {

  @Override
  public Beam<String> makeBeam(Map map, IMetricsContext iMetricsContext) {
    // Connect to the same ZooKeeper ensemble that the overlord uses.
    CuratorFramework curatorFramework = CuratorFrameworkFactory.newClient(
        map.get("tranquility.zk.connect").toString(),
        new BoundedExponentialBackoffRetry(100, 1000, 5)
    );
    curatorFramework.start();

    String indexService = "druid:prod:overlord";       // Your overlord's druid.service, with slashes replaced by colons.
    String firehosePattern = "druid:prod:firehose:%s"; // Make up a service pattern, include %s somewhere in it.
    String discoveryPath = "/druid/prod/discovery";    // Your overlord's druid.discovery.curator.path.
    String dataSource = "wikipedia";

    List<String> dimensions = ImmutableList.of(
        "page", "language", "user", "unpatrolled", "newPage", "robot", "anonymous",
        "namespace", "continent", "country", "region", "city"
    );
    List<AggregatorFactory> aggregators = ImmutableList.<AggregatorFactory>of(
        new CountAggregatorFactory("count"),
        new DoubleSumAggregatorFactory("added", "added"),
        new DoubleSumAggregatorFactory("deleted", "deleted"),
        new DoubleSumAggregatorFactory("delta", "delta")
    );

    return DruidBeams
        .builder(new Timestamper<String>() {
          @Override
          public DateTime timestamp(String a) {
            System.out.println(a);
            // Long date = Long.parseLong(theMap.get("timestamp").toString());
            // Note: this stamps every event with the current wall-clock time instead of the event's own timestamp.
            return new DateTime(System.currentTimeMillis());
          }
        })
        .curator(curatorFramework)
        .discoveryPath(discoveryPath)
        .location(DruidLocation.create(indexService, firehosePattern, dataSource))
        .rollup(DruidRollup.create(dimensions, aggregators, QueryGranularity.NONE))
        .tuning(ClusteredBeamTuning.builder()
            .segmentGranularity(Granularity.DAY)
            .windowPeriod(new Period("PT10M"))
            .replicants(1)
            .partitions(1)
            .build())
        .buildBeam();
  }
}
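For context, here is a minimal sketch of how a factory like this is typically wired into a Storm topology using tranquility-storm's BeamBolt. The component ids ("kafka-spout", "druid-bolt") and the parallelism hint are made-up placeholders for illustration, not values from this thread.

import backtype.storm.topology.TopologyBuilder;
import com.metamx.tranquility.storm.BeamBolt;

public class DruidTopologyWiring {
  public static void main(String[] args) {
    TopologyBuilder builder = new TopologyBuilder();

    // BeamBolt drives the Beam built by MyBeamFactory: it batches incoming tuples
    // and pushes them to the Druid indexing service (overlord -> peon) discovered via ZK.
    builder.setBolt("druid-bolt", new BeamBolt<String>(new MyBeamFactory()), 1)
           .shuffleGrouping("kafka-spout"); // "kafka-spout" is a placeholder for whatever spout reads from Kafka.
  }
}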

Hi Ram,
see Inline

There's no need to use standalone realtime nodes if you're using Tranquility, but you do need an indexing service (overlord + middle managers). In general, we recommend standalone realtime nodes if you want Druid to read directly from Kafka, and the indexing service if you want to push events into Druid using Storm, Samza, or your own custom processor.

Thanks Nishant & Gian, we are able to ingest and query.
Based on your inputs, we have come up with an architecture; I have attached it. Please let me know your feedback.

It would be helpful if we could add an architecture diagram to the production cluster config page.

Thanks!

Ram

P.S.: We can work on the diagram artwork, if our understanding is correct.

Hi Ram,

Your diagram looks good. To add a bit more detail about the communication between Tranquility, the overlord, and the peons:

Under the hood, Tranquility talks to the overlord to create realtime index tasks.

The overlord assigns each task to a middle manager, which spawns a peon to run the realtime index task.

The realtime index task announces itself in ZooKeeper service discovery, from where Tranquility discovers the peon running the task.

Once the task is found, Tranquility sends the batched events directly to the peon (realtime index task) via HTTP POST requests.
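If it helps with debugging that hand-off, here is a rough sketch (not Tranquility's actual internals) of listing what the realtime index tasks have announced under the discovery path, using Curator's service-discovery recipe (curator-x-discovery). The ZK connect string "localhost:2181" and the "/druid/prod/discovery" path are just example values echoing the ones earlier in the thread.

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.curator.x.discovery.ServiceDiscovery;
import org.apache.curator.x.discovery.ServiceDiscoveryBuilder;
import org.apache.curator.x.discovery.ServiceInstance;

public class ListAnnouncedTasks {
  public static void main(String[] args) throws Exception {
    // Same ZK ensemble and discovery path that the overlord and Tranquility use (example values).
    CuratorFramework curator = CuratorFrameworkFactory.newClient(
        "localhost:2181", new ExponentialBackoffRetry(1000, 3));
    curator.start();

    ServiceDiscovery<Void> discovery = ServiceDiscoveryBuilder.builder(Void.class)
        .client(curator)
        .basePath("/druid/prod/discovery")
        .build();
    discovery.start();

    // Each realtime index task announces a service named after the firehose pattern,
    // e.g. "druid:prod:firehose:..."; this prints whatever is currently announced.
    for (String serviceName : discovery.queryForNames()) {
      for (ServiceInstance<Void> instance : discovery.queryForInstances(serviceName)) {
        System.out.println(serviceName + " -> " + instance.getAddress() + ":" + instance.getPort());
      }
    }

    discovery.close();
    curator.close();
  }
}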

Thanks Nishant, we have submitted a pull request for the docs update. Let me know if you have any other suggestions.

–Ram

Thanks for the contribution! :)