Tranquility, Kafka & sequence of JSONs

Dears,

I hope this is the right place for this question. I’m using Druid 0.9.2, Tranquility 0.8.2 and Kafka 0.10.1. My pipeline is: -> Kafka -> Tranquility -> Druid. I’ve read that Tranquility’s HTTP API accepts “[ … ] either an array of JSON objects or a sequence of newline-delimited JSON objects [ … ]”. I’m doing the latter: sequences of newline-delimited JSON objects (10KB buffers, filled as much as possible) sent via Kafka. I’m clearly hitting a case where only the first JSON object of each buffer gets indexed in Druid; I’m not sure yet whether Tranquility or Druid is the culprit. Am I hitting something known, or am I just doing something wrong ( :)) )? Any help narrowing down the culprit would be appreciated.
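For context, the producer packs events into buffers roughly like this (a minimal sketch; `pack_ndjson`, the 10KB limit handling and the event shape are illustrative, not my actual producer code):

```python
import json

def pack_ndjson(events, max_bytes=10 * 1024):
    """Pack events into newline-delimited JSON buffers of at most max_bytes.

    Each buffer is one Kafka message; Tranquility should accept it as a
    sequence of newline-delimited JSON objects.
    """
    buffers, current, size = [], [], 0
    for event in events:
        line = json.dumps(event, separators=(",", ":")) + "\n"
        encoded = len(line.encode("utf-8"))
        # Flush the current buffer if adding this event would exceed the cap.
        if size + encoded > max_bytes and current:
            buffers.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += encoded
    if current:
        buffers.append("".join(current))
    return buffers
```

Each resulting buffer is sent to Kafka as a single message, so one message carries many JSON objects separated by newlines.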

Thanks,

Paolo

Dears,

I believe I’ve narrowed it down: Druid, rather than Tranquility, is the component originating the behaviour. Tranquility’s receivedCount and sentCount are the same, i.e. everything received from Kafka is sent on to Druid. Also, inserting a malformed JSON object (i.e. one just missing the timestamp field) at the end of one of the buffers invalidates the whole 10KB buffer, yet I get an unparseableCount of only 1. So Tranquility appears to do its processing just fine; something breaks after that. Any pointers on how to troubleshoot this further would be appreciated.
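For reference, this is roughly the experiment I ran, as a sketch (the `timestamp` field name and the `count_unparseable` helper are illustrative; the real check is whatever Druid’s parser applies):

```python
import json

def count_unparseable(buffer, timestamp_field="timestamp"):
    """Count lines in an NDJSON buffer that should be rejected:
    either malformed JSON or a valid object missing the timestamp field."""
    bad = 0
    for line in buffer.splitlines():
        if not line.strip():
            continue
        try:
            obj = json.loads(line)
        except ValueError:
            bad += 1
            continue
        if timestamp_field not in obj:
            bad += 1
    return bad
```

With one bad object appended to an otherwise clean buffer, this counts exactly 1 unparseable line, which matches the unparseableCount of 1 I see even though the whole buffer gets dropped.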

Thanks,

Paolo

PS: inserting into the pipeline a basic script that reads the 10KB buffer worth of JSON objects and splits them up, so that each one is sent as a separate Kafka message, makes everything work OK. The thing is, at high data volumes, not being able to buffer really multiplies the resources the setup needs.
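The splitting script is essentially this (a simplified sketch; the real script also re-publishes each line to Kafka with a producer, which is omitted here):

```python
import json

def split_buffer(buffer):
    """Split a newline-delimited JSON buffer into individual messages.

    Each non-empty, parseable line becomes one Kafka message; lines that
    fail to parse are dropped (they could instead go to a dead-letter topic).
    """
    messages = []
    for line in buffer.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            json.loads(line)  # validate before re-publishing
        except ValueError:
            continue
        messages.append(line)
    return messages
```

With this in place every object is indexed, but the per-message overhead at high volume is exactly the cost I was hoping to avoid by buffering.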