I have a use case where one of my data sources currently has one week's worth of data per segment. My segments are around 500MB each because I have configured my coordinator to merge them up to that size when possible.
Now I would like my queries to be more parallelized when I query one week's worth of data. To do that, I could decrease the segment size for this data source so that my queries are automatically processed on more nodes. I could lower the optimal segment size in the coordinator configuration, but then the segments of my other data sources would be impacted too, right? Is there a way to configure this merging feature at the data source level?
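To make the goal concrete, here is a rough sketch of the arithmetic I have in mind. It is not Druid configuration, just a back-of-the-envelope calculation. It assumes one week of this data source is about 500MB (the current segment size) and, as a simplification, that a query can be spread over at most one unit of work per segment. The function name and the candidate target sizes are made up for illustration.

```python
import math


def segment_count(interval_size_mb: float, target_segment_mb: float) -> int:
    """Estimate how many segments one interval splits into at a target size."""
    if target_segment_mb <= 0:
        raise ValueError("target_segment_mb must be positive")
    # At least one segment always exists for a non-empty interval.
    return max(1, math.ceil(interval_size_mb / target_segment_mb))


# From the post: one week currently fits in a single ~500MB segment.
WEEK_SIZE_MB = 500

# Hypothetical target sizes, chosen only to show the trend.
for target_mb in (500, 250, 100):
    count = segment_count(WEEK_SIZE_MB, target_mb)
    print(f"target {target_mb}MB: about {count} segment(s) per week")
```

So with a smaller target for this one data source, a one-week query would have several segments to fan out over instead of a single one.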
I can’t control the segment size at ingestion time because my pipelines process only one hour of data at a time and always produce small segments. Any idea how I could achieve this?
Thanks for your comments.