Can Druid compact data with a new query granularity?

The linked official doc only gives an example of compacting data with a new segment granularity, not a new query granularity. Is it possible to compact with a new query granularity?

I am also confused about the purpose of having two different levels of granularity, segment granularity and query granularity. As far as I know, Druid truncates the event timestamp and rolls up based on query granularity, so query granularity can reduce the data size and speed up queries. What is the purpose of segment granularity? For example, suppose query granularity is minutely and segment granularity is hourly. The only use case for segment granularity I can come up with is speeding up GroupBy queries with granularity >= hourly: Druid would do a further hourly pre-aggregation on top of the minutely roll-up. Please correct me if I am wrong.

Xuanyi,

Segment granularity helps with two things: (1) bucketing your segments in time, and (2) controlling the size of your segments.

For item (1), the granularity could be DAY, WEEK, MONTH, YEAR, etc. Think of a Windows machine where you have directories on the C: drive. The segment granularity is similar to a directory: it's a placeholder for storing segments. You have to find a balance and avoid creating too many directories, and defining the granularity at the MINUTE level is a good example of creating too many. MINUTE only makes sense if each minute produces GBs of segments, and that is very rare.

For item (2), your segment size will influence your query speed. Make sure that your segments are close to 700MB for optimal performance.

Query granularity, on the other hand, has nothing to do with segment granularity. It dictates how coarse or fine you want your queries to be from a time perspective, i.e. whether you can query at MILLISECOND, SECOND, MINUTE, HOUR, etc.
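
As a rough illustration of the difference, here is a minimal Python sketch. It is not Druid's actual implementation, and the `truncate` and `roll_up` helpers and the sample events are made up for this example. It assumes segment granularity only decides which time bucket a row is stored in, while query granularity decides what the stored timestamp is truncated to before roll-up.

```python
from collections import Counter
from datetime import datetime, timezone


def truncate(ts: datetime, granularity: str) -> datetime:
    """Floor a timestamp to the start of its MINUTE, HOUR, or DAY bucket."""
    if granularity == "MINUTE":
        return ts.replace(second=0, microsecond=0)
    if granularity == "HOUR":
        return ts.replace(minute=0, second=0, microsecond=0)
    if granularity == "DAY":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported granularity: {granularity}")


def roll_up(events, segment_granularity, query_granularity):
    """Place each event in a segment bucket, then count rows per truncated timestamp."""
    segments = {}
    for ts in events:
        bucket = truncate(ts, segment_granularity)   # which "directory" the row lands in
        row_time = truncate(ts, query_granularity)   # timestamp actually stored in the row
        segments.setdefault(bucket, Counter())[row_time] += 1
    return segments


# Hypothetical sample events.
events = [
    datetime(2020, 1, 1, 10, 0, 5, tzinfo=timezone.utc),
    datetime(2020, 1, 1, 10, 0, 42, tzinfo=timezone.utc),
    datetime(2020, 1, 1, 10, 30, 0, tzinfo=timezone.utc),
    datetime(2020, 1, 1, 11, 15, 9, tzinfo=timezone.utc),
]

segments = roll_up(events, "HOUR", "MINUTE")
for bucket, rows in sorted(segments.items()):
    print(bucket.isoformat(), {t.isoformat(): n for t, n in rows.items()})
```

With hourly segments and minutely query granularity, the four events land in two hourly buckets, and the two events in the same minute collapse into one row. The rows inside each hourly bucket keep their minute-level timestamps; in this sketch the hourly bucket adds no further aggregation.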

Rommel Garcia

Thank you for your reply. Do you have an answer to my first question, though? Can Druid compact data with a new query granularity?

Best,

Xuanyi

Compaction doesn't allow you to change the queryGranularity. You will have to reindex the data using Hadoop-based or native batch ingestion.
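
A reindexing task for this could look roughly like the sketch below, which builds a native batch spec as a Python dict and prints it as JSON. It assumes a Druid version that has the `druid` input source for reading existing segments; the datasource name `events`, the dimension `country`, the `count` metric, and the `build_reindex_spec` helper are hypothetical, so check the field names against the ingestion docs for your version before using it.

```python
import json


def build_reindex_spec(datasource, interval, segment_granularity,
                       query_granularity, dimensions, metrics):
    """Build a native batch task that re-reads a datasource with a new queryGranularity."""
    return {
        "type": "index_parallel",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Existing segments expose their timestamp as __time in milliseconds.
                "timestampSpec": {"column": "__time", "format": "millis"},
                "dimensionsSpec": {"dimensions": dimensions},
                "metricsSpec": metrics,
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": segment_granularity,
                    "queryGranularity": query_granularity,  # the value being changed
                    "rollup": True,
                    "intervals": [interval],
                },
            },
            "ioConfig": {
                "type": "index_parallel",
                # Assumption: read from the same datasource and overwrite the interval.
                "inputSource": {
                    "type": "druid",
                    "dataSource": datasource,
                    "interval": interval,
                },
                "appendToExisting": False,
            },
            "tuningConfig": {"type": "index_parallel"},
        },
    }


spec = build_reindex_spec(
    datasource="events",                       # hypothetical datasource
    interval="2020-01-01/2020-01-02",
    segment_granularity="DAY",
    query_granularity="HOUR",                  # coarser than the original MINUTE
    dimensions=["country"],                    # hypothetical dimension
    # Re-aggregate the already rolled-up count column with a sum, not a new count.
    metrics=[{"type": "longSum", "name": "count", "fieldName": "count"}],
)
print(json.dumps(spec, indent=2))
```

The printed JSON is what you would submit as a task to the Overlord. Note that the existing `count` metric is summed rather than recounted, since each stored row may already represent several raw events.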

Rommel Garcia