Hi! I’m looking to set up a small Druid cluster for evaluation in my office, and I’m a little stuck on how best to stream data into Druid via the Indexing Service.
We’re pretty invested in Samza for dealing with all of our streaming data from Kafka, but I’ll admit right away that I’m not terribly familiar with Samza’s paradigm, and that may very well be why I’m confused about Tranquility!
I’m getting lost trying to wrap my head around the example Samza task shown in the GitHub README (https://github.com/druid-io/tranquility/blob/master/README.md).
Is this code example something that could (with an appropriate data set) be run all by itself as an independent Samza stream task? Or am I supposed to implement MyBeamFactory alongside my stream task and feed data to this class? Or does it expect to be run apart from Samza and fed data from the stream task while running as its own process?
My apologies if these are noob-ish questions, but as an Ops guy my dev chops are a bit weak!
FWIW, I would just try to dump the example code into a Samza task and run it to see what it does, but our Samza environment is entirely Clojure-based… so getting it ported will take me a while, and it will be easier on me if I understand where Tranquility fits into the puzzle so that I can build my task up from scratch.
Thanks for your time!