Hi,
I was going through secondary partition usage on https://groups.google.com/forum/#!searchin/druid-user/partitionDimensions|sort:date/druid-user/QxZLwtoehFI/G4U0L8NkBgAJ.
I have the following questions regarding secondary partitioning.
a. Single-dimension partitioning, as given below:
"partitionsSpec": {
  "type": "dimension",
  "targetPartitionSize": 100000000,
  "partitionDimension": ""
}
Is this functionality available only for Hadoop indexing? Is it possible to achieve similar functionality using native (index) indexing?
b. I was trying the following using native (index) indexing. The data source is a single zip file containing data for 31 days. segmentGranularity is set to 'DAY', so there are 31 time chunks.
I have a dimension, geography_desc, with 6 values (North America, South America, EU…). About 90% of the queries are on this dimension, so I am trying to partition on it.
"tuningConfig": {
  "type": "index",
  "partitionDimensions": [ "geography_desc" ],
  "maxRowsPerSegment": 50000,
  "maxTotalRows": 20000
}
Using the above config, the indexing task created thousands of partitions, as expected. We intentionally set maxRowsPerSegment to 50000, just to check how segments are created.
But if we change the spec to the following:
"tuningConfig": {
  "type": "index",
  "partitionDimensions": [ "geography_desc" ],
  "maxRowsPerSegment": 50000,
  "maxTotalRows": 20000,
  "forceGuaranteedRollup": true
}
This created fewer than 100 segments, each with at most 50000 rows. But when we run count(*), we get only 2000+ as the result. The unified console shows all the segments, but numRows is 0 for every segment except one (for one day). Incidentally, that day has fewer than 50000 rows, so only one segment was created for it. So we were trying to understand where the data in all the other segments went (the segments do show up in the unified console).
After looking at deep storage, we realized that the partitions are numbered "1", "2", … but "0" is missing. Only for the one day with fewer than 50000 rows is the partition "0", and only this segment's data is returned by queries. So my guess is that for all the other days the partition "0" segment is missing. Could this be a bug?
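To make the check concrete, here is a minimal Python sketch of it. It assumes the partition numbers per day have already been collected from deep storage into a dict. The function name, the day labels, and the sample values are all hypothetical and are not my actual data.

```python
from typing import Dict, List


def days_missing_partition_zero(partitions_by_day: Dict[str, List[int]]) -> List[str]:
    """Return the days whose list of partition numbers does not include 0."""
    return sorted(
        day for day, partition_nums in partitions_by_day.items()
        if 0 not in partition_nums
    )


# Hypothetical sample: two days start at partition 1, one day has only partition 0.
observed = {
    "day-01": [1, 2, 3],
    "day-02": [1, 2],
    "day-03": [0],
}

print(days_missing_partition_zero(observed))
```

With a layout like the one I described, every day except the small one would show up in the returned list.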
BTW, this was done on version 0.15.0.
Any comments/input is welcome.
Regards, Chari.