How to set the partition count and replication factor of a Kafka topic

Hi all,
I am using Druid 0.9.1.1 and loading data from Kafka in real time.

The topic in Kafka is devilfish_ntes_log, and the topic information is:

Topic:devilfish_ntes_log PartitionCount:2 ReplicationFactor:1 Configs:

Topic: devilfish_ntes_log Partition: 0 Leader: 2 Replicas: 2 Isr: 2

Topic: devilfish_ntes_log Partition: 1 Leader: 0 Replicas: 0 Isr: 0

I set the partition count to 2 and the replication factor to 1.

My question is this: when I start the indexing service, only one node starts a single process to read data from the topic. Is that expected?

I would have thought that, since the topic has 2 partitions, Druid would start 2 processes to read the data and improve performance.

How should I set the partition count and replication factor if I want Druid to start more than one process to read data at the same time?

Thanks.

Hi Yufeng,

You can set "taskCount": 2 in the ioConfig section of the supervisor spec, and the partitions will be split between the tasks so that each task reads one partition. If you want replica tasks in Druid for redundancy, you can set "replicas": n, but you'll need at least n middle managers so that the replicas can run on different nodes. See http://druid.io/docs/0.9.1.1/development/extensions-core/kafka-ingestion.html for more information.
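
To make the effect of these two settings concrete, here is a rough Python sketch. It builds only the ioConfig fragment of a supervisor spec and shows how 2 partitions could be divided between 2 tasks. The broker address localhost:9092 and the helper names are made up for illustration, and the modulo-based assignment is an assumption about how the supervisor groups partitions, not something confirmed in this thread. The remaining sections of the spec (dataSchema, tuningConfig) are omitted.

```python
import json


def build_io_config(topic, bootstrap_servers, task_count, replicas=1):
    """Return only the ioConfig fragment of a Kafka supervisor spec."""
    return {
        "topic": topic,
        # Hypothetical broker address; replace with your own cluster.
        "consumerProperties": {"bootstrap.servers": bootstrap_servers},
        "taskCount": task_count,  # reading tasks per replica set
        "replicas": replicas,     # number of replica sets
    }


def assign_partitions(partition_count, task_count):
    """Illustrative split: partition p goes to task group p % task_count.

    This mirrors the idea of dividing partitions between tasks; the exact
    rule Druid applies is assumed here.
    """
    groups = {}
    for partition in range(partition_count):
        groups.setdefault(partition % task_count, []).append(partition)
    return groups


def max_reading_tasks(partition_count, task_count, replicas):
    """Upper bound on reading tasks: extra tasks beyond the partition
    count would have nothing to read, so they are not counted."""
    return min(task_count, partition_count) * replicas


io_config = build_io_config(
    "devilfish_ntes_log", "localhost:9092", task_count=2, replicas=1
)
print(json.dumps({"ioConfig": io_config}, indent=2))
print(assign_partitions(2, io_config["taskCount"]))
print(max_reading_tasks(2, io_config["taskCount"], io_config["replicas"]))
```

With 2 partitions and taskCount 2, each task group ends up with exactly one partition. Raising taskCount above the partition count would not add readers, so to scale further you would need more partitions on the Kafka topic itself.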

Thank you, David. It works :)

On Thursday, September 8, 2016 at 3:09:47 AM UTC+8, David Lim wrote: