I am implementing Druid at our firm and trying to understand how its query internals work. I couldn’t find any documentation on how the data in the segments is filtered when a query is fired. I have looked at how the segments are created, and I have two questions:
- Could someone please explain how the bitmaps for a dimension map to tuple ids, so that the other column values of the matching tuples can be fetched?
For example -
Bitmaps - one for each unique value of the column
value=“Justin Bieber”: [1,1,0,0]
In the above example, Justin Bieber is present in tuples 1 and 2. Is the bit position the only way each column value maps to the corresponding tuple id?
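To make my current mental model concrete, here is a small Python sketch. It is not Druid code: the column names, the values, and the `rows_for` helper are all made up for illustration. I am assuming that bit position i in a bitmap is the same as row offset i in every other column of the segment (0-based here, so "tuples 1 and 2" above are offsets 0 and 1).

```python
# Toy columnar segment: every column is a list indexed by row offset.
# All names and values are hypothetical, not taken from Druid.
segment = {
    "timestamp": ["t0", "t1", "t2", "t3"],
    "page": ["Justin Bieber", "Justin Bieber", "Other Page", "Other Page"],
    "pageviews": [1800, 2912, 450, 90],
}

# One bitmap per unique value of the "page" dimension.
page_bitmaps = {
    "Justin Bieber": [1, 1, 0, 0],
    "Other Page": [0, 0, 1, 1],
}


def rows_for(bitmap, columns):
    """Return the full rows whose bit is set, assuming bit i <-> row offset i."""
    offsets = [i for i, bit in enumerate(bitmap) if bit]
    return [{name: col[i] for name, col in columns.items()} for i in offsets]


matches = rows_for(page_bitmaps["Justin Bieber"], segment)
for row in matches:
    print(row)
```

If this assumption is right, the bitmap never needs to store a tuple id at all, because the position of the bit is the id. That is the part I would like confirmed.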
- Bitmaps work well for boolean filters such as AND, OR, and NOT. How would range queries on metrics work, for example select * from wikipedia where pageviews > 1000?
Similarly, how are metric values mapped to tuple ids so that the corresponding timestamp and dimension values can be gathered?
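Here is a rough Python sketch of what I mean, again with made-up data and helper names. The boolean part is how I understand bitmap filters in general. The `bitmap_from_scan` function is only my guess at one way a range filter on a metric could be handled (scan the metric column once and build a bitmap of matching row offsets); I do not know whether Druid does anything like this.

```python
# Hypothetical toy data; list index i is assumed to be row offset i.
pageviews = [1800, 2912, 450, 90]
bieber = [1, 1, 0, 0]   # bitmap for page = "Justin Bieber"
english = [1, 0, 1, 0]  # bitmap for a second, made-up dimension value


def bitmap_and(a, b):
    """Rows that match both filters."""
    return [x & y for x, y in zip(a, b)]


def bitmap_or(a, b):
    """Rows that match either filter."""
    return [x | y for x, y in zip(a, b)]


def bitmap_not(a):
    """Rows that do not match the filter."""
    return [1 - x for x in a]


def bitmap_from_scan(values, predicate):
    """Naive guess: scan a metric column and set a bit for each matching row."""
    return [1 if predicate(v) else 0 for v in values]


over_1000 = bitmap_from_scan(pageviews, lambda v: v > 1000)
combined = bitmap_and(bieber, over_1000)
print(over_1000, combined)
```

With a bitmap like `combined`, the set bits would give the row offsets to read from the timestamp and dimension columns, which is why I am asking whether metrics are addressed by the same row offset.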