Tranquility Throughput and Scaling

I’m using the following code to push events to Druid via Tranquility:

    long start = System.currentTimeMillis();

    final Future<Integer> numSentFuture = druidService.apply(eventBuffer);

    int numSent = Await.result(numSentFuture);

    long duration = System.currentTimeMillis() - start;

    LOGGER.info("Flushed [{}] events to druid in [{}] ms.", numSent, duration);

eventBuffer has 10–100K events in it; each is a map with 20–30 fields.

I’m seeing performance of roughly 10K events/s on a single machine.

The performance doesn’t seem to be related to CPU.

Does that performance look about right?

I assume the performance is bottlenecked by the network overhead between the JVM and the Druid Peon?

Any parameters I should look at tuning, or should I just scale out horizontally from here?

-brian

Hey Brian,

First try updating to 0.7.2 if you aren’t already on it, and porting your code to use a Tranquilizer. That usually gets better throughput because it handles batching for you and also can potentially have many batches in flight at any given time.

You can do that by:

  1. Use buildTranquilizer instead of buildBeam or buildService on the DruidBeams builder.

  2. There’s some sample code in: https://github.com/gianm/tranquility/blob/config-refactor/core/src/test/java/com/metamx/tranquility/example/JavaExample.java

Note that the “DruidBeams.fromConfig” and “sender.flush()” methods don’t exist in master yet, they’re currently part of a PR. But you can still use tranquilizers.

If you care about knowing when a set of messages has flushed, then because “flush()” isn’t available yet, you do have to do a bit of legwork. You could hold onto the Future objects and wait for them to resolve with something like: Await.result(Future.collect(… all the futures …)). That’ll work if you have small microbatches, but if you have big ones it could be prohibitively expensive heap-wise. An alternative is to just track how many pending messages there are (e.g. with an AtomicLong that you increment before you send and decrement in the callback) and wait for that to reach zero when you want to make sure everything has flushed out.
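To illustrate the counter approach, here is a minimal, self-contained sketch in plain Java, using CompletableFuture to stand in for Tranquility’s Twitter Futures (the send call itself is simulated; only the increment-before-send / decrement-in-callback pattern is the point):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

public class PendingCounterExample {
    // Simulate n sends, decrementing the counter from each completion
    // callback, then wait until everything is "flushed" and return the count.
    static long runBatch(int n) throws InterruptedException {
        final AtomicLong pending = new AtomicLong(0);
        for (int i = 0; i < n; i++) {
            pending.incrementAndGet();  // count the message before sending
            CompletableFuture
                .runAsync(() -> { /* stand-in for tranquilizer.send(event) */ })
                .whenComplete((v, t) -> pending.decrementAndGet());
        }
        // Wait until every callback has fired, i.e. all messages are flushed.
        while (pending.get() > 0) {
            Thread.sleep(10);
        }
        return pending.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("flushed, pending=" + runBatch(1000));
    }
}
```

The same shape works with Tranquility’s futures: increment before `send()`, decrement in the future’s completion callback.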

Or you could pull the patch (https://github.com/druid-io/tranquility/pull/112) and build your own version.

Once you do that you can play around with batchSize and maxPendingBatches to see where you get the best throughput. The defaults should be pretty good but you might still see some benefit from tweaking them.
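For reference, a hedged sketch of how those knobs might be wired up, assuming Tranquility 0.7.x where Tranquilizer.builder() exposes maxBatchSize and maxPendingBatches setters (check your version’s API; names and defaults may differ). The values and the `druidBeamsBuilder` variable are placeholders:

```java
// Hypothetical values; tune empirically. Assumes a DruidBeams builder named
// "druidBeamsBuilder" has already been configured elsewhere.
final Tranquilizer<Map<String, Object>> sender = druidBeamsBuilder
    .buildTranquilizer(
        Tranquilizer.builder()
                    .maxBatchSize(2000)       // events per batch
                    .maxPendingBatches(5)     // batches allowed in flight
    );
sender.start();
```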

Perfect. Thanks for the quick response Gian.

Will do.

-brian

Gian,

Thanks. I moved to Tranquilizer and I am in the process of tuning the parameters.

Fortunately for this use case, we can operate in a “fire and forget” model and things look promising.

Performance is a bit spiky, presumably because of the periodic asynchronous flushing of the batches.

I’ll let you know where we end up with regard to throughput.

thanks again,

-brian

Brian, do you mind sharing your DruidService call? I am trying to measure the throughput of the Tranquility Core API. Looks like you have a good starting point.

Can you share your recent numbers as well?

Thanks!

The apply method is on the Druid BeamService:
https://github.com/druid-io/tranquility/blob/b57ada09791d76f78675c683e07a872ecf438da9/core/src/main/scala/com/metamx/tranquility/finagle/BeamService.scala#L31

Our numbers haven’t changed all that much.

thanks,

-brian

But how are you constructing your spec, via a JSON file or Java? In fact, this is where my confusion is: how to construct the spec for Tranquility via the Java API rather than using JSON.

If you want to set things up without JSON, you can use DruidBeams.builder rather than DruidBeams.fromConfig; see here for docs: http://static.druid.io/tranquility/api/latest/#com.metamx.tranquility.druid.DruidBeams$
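In case it helps, here is a rough sketch of the builder route, adapted from the Tranquility README for the 0.7.x line. The ZooKeeper address, discovery path, index service name, dataSource, and the `dimensions`/`aggregators` lists are all placeholders, and exact builder methods can differ between versions, so treat this as a starting point rather than a drop-in:

```java
// All string values and the dimensions/aggregators lists are placeholders.
final Timestamper<Map<String, Object>> timestamper =
    new Timestamper<Map<String, Object>>() {
        @Override
        public DateTime timestamp(Map<String, Object> theMap) {
            return new DateTime(theMap.get("timestamp"));
        }
    };

final CuratorFramework curator = CuratorFrameworkFactory.newClient(
    "zk.example.com:2181",
    new ExponentialBackoffRetry(1000, 20, 30000)
);
curator.start();

final Tranquilizer<Map<String, Object>> druidService = DruidBeams
    .builder(timestamper)
    .curator(curator)
    .discoveryPath("/druid/discovery")
    .location(DruidLocation.create("druid/overlord", "mydatasource"))
    .timestampSpec(new TimestampSpec("timestamp", "auto", null))
    // QueryGranularity.MINUTE in older Druid versions
    .rollup(DruidRollup.create(
        DruidDimensions.specific(dimensions),
        aggregators,
        QueryGranularities.MINUTE))
    .tuning(
        ClusteredBeamTuning.builder()
                           .segmentGranularity(Granularity.HOUR)
                           .windowPeriod(new Period("PT10M"))
                           .build()
    )
    .buildTranquilizer();
druidService.start();
```

Everything the JSON spec would carry (location, timestamp spec, rollup, tuning) is expressed through the builder instead.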