How to configure the shard size when loading data from Kafka?

I use the Kafka indexing service to load data from Kafka. The data volume is not large, no more than about 50 MB per day, so the shards generated by the indexing service are small.
I want to generate one segment per day, so I use:
"segmentGranularity": "DAY"

But each segment contains many small shards, each no larger than 10 MB.
I have configured:
  "tuningConfig": {
    "type": "kafka",
    "maxRowsPerSegment": 500000000
  },
but it still generates a lot of small shards. How can I configure it to generate one big shard?

What is the average segment size you are getting?

Rommel Garcia
Director, Field Engineering

One day corresponds to one segment, and that segment has 48 shards.

The size of each shard is between 600 KB and 800 KB.

Hi,

Would you attach one of the ingestion task logs here? It may give us more information about why it keeps creating new shards.

Another option is to run a post-ingestion compaction task regularly to merge the small shards into more optimally sized ones (see the sketch below).
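
A compaction task spec would look roughly like this; the dataSource and interval below are only placeholders, and the exact fields available depend on your Druid version, so please check the compaction task docs:

  {
    "type": "compact",
    "dataSource": "my_datasource",
    "interval": "2018-01-01/2018-01-02"
  }

You can submit it to the overlord on a schedule (for example a daily cron job) covering the previous day's interval.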

Thanks

Ming

Thanks.
But there's another question: my Druid cluster doesn't print any logs for the indexing task.
I don't know why; a task completes and only prints:
Thread-2 ERROR Unable to register shutdown hook because JVM is shutting down. java.lang.IllegalStateException: Not started
    at io.druid.common.config.Log4jShutdown.addShutdownCallback(Log4jShutdown.java:45)
    at org.apache.logging.log4j.core.impl.Log4jContextFactory.addShutdownCallback(Log4jContextFactory.java:273)
    at org.apache.logging.log4j.core.LoggerContext.setUpShutdownHook(LoggerContext.java:256)
    at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:216)
    at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:145)
    at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41)
    at org.apache.logging.log4j.LogManager.getContext(LogManager.java:182)
    at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:103)
    at org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:43)
    at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42)
    at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:29)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:253)
    at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155)
    at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132)
    at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:273)
    at org.apache.hadoop.hdfs.LeaseRenewer.<clinit>(LeaseRenewer.java:72)
    at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:699)
    at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:859)
    at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:853)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2407)
    at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2424)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

Probably the druid user does not have write permission on the "druid.indexer.logs.directory" that is configured in common.runtime.properties?
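
For reference, a file-based task log setup in common.runtime.properties looks something like this (the directory path is just an example, and it must be writable by the user running the middleManager and its peons):

  # write task logs to local disk
  druid.indexer.logs.type=file
  druid.indexer.logs.directory=var/druid/indexing-logs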

Thanks, I have solved it.

It was because of the taskDuration; I increased it.
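
For anyone else hitting this: taskDuration is set in the supervisor's ioConfig, and since each Kafka indexing task publishes its own shards, a short taskDuration tends to produce many small shards per segment interval. Increasing it would look roughly like this (the topic, bootstrap servers, and duration here are only example values):

  "ioConfig": {
    "type": "kafka",
    "topic": "my_topic",
    "consumerProperties": {
      "bootstrap.servers": "localhost:9092"
    },
    "taskDuration": "PT6H"
  }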