If you’re reading this, you’ve probably already seen the following language in the docs:
For Druid to operate well under heavy query load, it is important for the segment file size to be within the recommended range of 300MB-700MB.
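But, like so many things in Druid, this is a guideline rather than a hard-and-fast rule. Much bigger and much smaller segments can both work. The main downside of much bigger segments is poor parallelization: Druid processes each segment on a single thread, so a handful of very large segments cannot be spread across many cores. The main downside of much smaller segments is high per-segment overhead, which adds up as the number of segments grows.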
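To make the trade-off concrete, here is a minimal Python sketch that labels a segment size against the range quoted above and estimates how many segments a time chunk would need to land near a target size. The function names, the 500MB default target, and the reading of "MB" as 1024 * 1024 bytes are assumptions made for illustration; none of this is a Druid API.

```python
import math

# Assumption: "MB" in the quoted recommendation is read as 1024 * 1024 bytes.
MB = 1024 * 1024
RECOMMENDED_MIN_BYTES = 300 * MB
RECOMMENDED_MAX_BYTES = 700 * MB


def classify_segment(size_bytes: int) -> str:
    """Label a segment size relative to the recommended 300MB-700MB range."""
    if size_bytes < RECOMMENDED_MIN_BYTES:
        return "small"  # risk: high overhead per segment
    if size_bytes > RECOMMENDED_MAX_BYTES:
        return "large"  # risk: poor parallelization
    return "in-range"


def partitions_for_chunk(chunk_bytes: int, target_bytes: int = 500 * MB) -> int:
    """Hypothetical helper: how many segments to split one time chunk into
    so that each segment lands near target_bytes (always at least one)."""
    return max(1, math.ceil(chunk_bytes / target_bytes))


if __name__ == "__main__":
    for size in (50 * MB, 500 * MB, 4096 * MB):
        print(size // MB, "MB ->", classify_segment(size))
    print("segments for a 3072 MB chunk:", partitions_for_chunk(3072 * MB))
```

A "small" or "large" label here is only a prompt to look closer, not a verdict, since segments outside the range can still work well for a given workload.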
This tidbit came out of a discussion within the Apache Druid workspace. Feel free to join us there. Here’s the link to the complete discussion.