Load balancing realtime ingest tasks with Tranquility

Hi all,

We use Tranquility to ingest realtime data, and I want to know how to balance the load across multiple realtime ingest tasks for the same datasource. (For example, with task.replicants=1 there is only one task per datasource.) Is ‘task.partitions’ the answer? If so, how does partitioning work? When a single event comes in, which partition does it go to? Thanks.

Hey Jiao,

task.partitions refers to how many Druid segments you’ll get per segmentGranularity period. So if you have segmentGranularity “day” and task.partitions = 2, then you get two segments per day. Usually people set this to a number that gets them good performance and reasonably sized segments.

The default partitioning strategy in Tranquility 0.8.0+ is to partition based on a hash of all the dimensions of your events, which is usually a good strategy. If you want a different one, you can override it in code by providing a custom Partitioner class (there’s no way to override the default strategy purely through configuration).
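
To make the "which partition does an event go to" part concrete, here is a rough, standalone sketch of hash-based partitioning on dimensions. It is not Tranquility's actual code: the class name DimensionHashPartitioner and its partition method are hypothetical, and events are assumed to be simple maps of dimension name to value. The idea is only that the hash of the dimensions, taken modulo the number of partitions, picks the partition.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is NOT Tranquility's own
// Partitioner class, just the same general idea in plain Java.
final class DimensionHashPartitioner {

    // Maps an event's dimensions to a partition index in [0, numPartitions).
    static int partition(Map<String, String> dimensions, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Map.hashCode() is defined as the sum of the entry hash codes, so the
        // result does not depend on the order the dimensions were inserted in.
        int hash = dimensions.hashCode();
        // floorMod keeps the index non-negative even when the hash is negative.
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        Map<String, String> event = new HashMap<>();
        event.put("country", "US");
        event.put("browser", "firefox");

        // With two partitions (as in the task.partitions = 2 example),
        // every event lands in partition 0 or 1.
        System.out.println("partition = " + partition(event, 2));
    }
}
```

Events with identical dimension values always map to the same partition, while events with different values tend to spread across partitions, which is what balances the load. A custom strategy would replace the hashing step, for example by hashing only one chosen dimension.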