Why does every Kafka index service task create so many shards?

Hi,
I have some Kafka index service tasks, and I found that each segment has more than 1 shard even though the data is small, like below:

The segment granularity is HOUR and the query granularity is MINUTE. How can I get each hourly segment into 1 shard? Thank you so much.

This is my ingestion JSON:

curl -X POST -H 'Content-Type: application/json' -d '{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "shopee_id__order_sales",
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "timestampSpec": {
          "column": "event_time",
          "format": "auto"
        },
        "dimensionsSpec": {
          "dimensions": ["event"],
          "dimensionExclusions": []
        }
      }
    },
    "metricsSpec": [
      {
        "name": "gmv",
        "type": "doubleSum",
        "fieldName": "gmv"
      },
      {
        "type": "hyperUnique",
        "name": "uniq_order",
        "fieldName": "orderid",
        "isInputHyperUnique": false,
        "round": false
      },
      {
        "type": "hyperUnique",
        "name": "uniq_user",
        "fieldName": "userid",
        "isInputHyperUnique": false,
        "round": false
      }
    ],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "HOUR",
      "queryGranularity": "MINUTE"
    }
  },
  "tuningConfig": {
    "type": "kafka",
    "resetOffsetAutomatically": true,
    "maxRowsPerSegment": 7000000,
    "chatThreads": 10
  },
  "ioConfig": {
    "topic": "seller_center_id__be_order_items",
    "consumerProperties": {
      "bootstrap.servers": "shopee-kafka00:9092,shopee-kafka01:9092,shopee-kafka02:9092,shopee-kafka03:9092,shopee-kafka04:9092,shopee-kafka05:9092"
    },
    "taskCount": 1,
    "replicas": 1,
    "topicPattern.priority": "1"
  }
}' http://druid-overlord00:8090/druid/indexer/v1/supervisor

Do you perhaps have many partitions on your Kafka topic? I believe each partition results in a separate shard.

Yeah, there is more than 1 partition in this topic. But why does every segment have a different number of shards, and how can I limit it to 1 shard with the Kafka index service?

I am still a beginner myself, but I don't think you can control the shards that are created from Kafka partitions. My advice would be to have only as many partitions as taskCount in your Kafka config; then no extra shards will be created.
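To see how many partitions the topic actually has, you can describe it with the standard Kafka CLI. This is a sketch; the ZooKeeper host is an assumption (on newer Kafka versions you would pass --bootstrap-server with one of your broker addresses instead):

```shell
# Describe the topic to see its partition count
# (zk-host:2181 is a placeholder for your ZooKeeper ensemble)
kafka-topics.sh --describe \
  --zookeeper zk-host:2181 \
  --topic seller_center_id__be_order_items
```

The output lists one line per partition, so the line count tells you how many shards to expect per segment interval.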

You can also run compaction tasks afterwards to merge the segments together. And I believe when Druid 0.13 comes out, it will have a way to set up automatic compaction.
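A compaction task spec looks roughly like the sketch below, POSTed to the overlord's task endpoint (http://druid-overlord00:8090/druid/indexer/v1/task). The interval here is only an example; use the hour you want to merge into a single shard:

```json
{
  "type": "compact",
  "dataSource": "shopee_id__order_sales",
  "interval": "2018-10-01T00:00:00/2018-10-01T01:00:00"
}
```

Druid reads all segments in the given interval and rewrites them, which collapses the per-partition shards into fewer segments.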

OK, I see. Do you know the release date for Druid 0.13?