Query internals: in segments, how do the bitmaps map to the tuple ID?

Hi,
I am implementing Druid at our firm and trying to understand how Druid's query internals work. I couldn't find any relevant documentation on how the data in the segments is filtered when a query is fired. I have looked at how the segments are created.

  1. Could someone please explain how the bitmaps in the dimensions map to the tuple ID, so that the other column values in the matching tuples can be fetched?

For example -

Bitmaps - one for each unique value of the column

value=“Justin Bieber”: [1,1,0,0]

value=“Ke$ha”: [0,0,1,1]

In the above example, we see that Justin Bieber is present in tuples 1 and 2. Is this the only way each column value maps to the corresponding tuple ID?

  2. Bitmaps work well for boolean filters such as AND, OR, and NOT. How would range queries on metrics work, for example select * from wikipedia where pageviews > 1000?

Similarly, how do the metrics map to the tuple ID so that the corresponding timestamp or dimension values can be gathered?

Thank you

Hi Balaji,

Each string column has both a forward index mapping row numbers to values, and an inverted bitmap index mapping values to row numbers. The row number plays the role of your tuple ID: it is the same across all columns of a segment, so once you know which rows match, you can read those rows from any other column. Druid will use one index or the other depending on what kind of query you are doing. It will usually use the forward index for grouping or selecting values, and the inverted index for filtering, although it sometimes uses the forward index for filtering as well.
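
To make the two indexes concrete, here is a toy sketch in Python. It is not Druid's actual storage format or code: the names (artist_rows, forward_index, inverted_index, rows_matching, value_at) and the pageviews numbers are made up for illustration, bitmaps are plain Python integers rather than compressed bitmaps, and row numbers are zero-based, so your tuples 1 and 2 are rows 0 and 1 here.

```python
# Toy model of one string column in a segment. Illustrative only,
# not Druid's real on-disk layout.
artist_rows = ["Justin Bieber", "Justin Bieber", "Ke$ha", "Ke$ha"]

# Dictionary: every distinct value gets a small integer id.
dictionary = sorted(set(artist_rows))
value_to_id = {value: i for i, value in enumerate(dictionary)}

# Forward index: row number -> dictionary id.
forward_index = [value_to_id[value] for value in artist_rows]

# Inverted index: dictionary id -> bitmap (here an int), where bit r is set
# when row r holds that value.
inverted_index = [0] * len(dictionary)
for row, value_id in enumerate(forward_index):
    inverted_index[value_id] |= 1 << row


def rows_matching(value):
    """Filter with the inverted index: row numbers where the column equals value."""
    value_id = value_to_id.get(value)
    if value_id is None:
        return []
    bitmap = inverted_index[value_id]
    return [r for r in range(len(forward_index)) if (bitmap >> r) & 1]


def value_at(row):
    """Select with the forward index: the column value stored at a row number."""
    return dictionary[forward_index[row]]


# Another column of the same segment (hypothetical numbers). It shares the
# row numbering, so the matched row numbers index straight into it.
pageviews = [1500, 200, 3000, 50]

matched = rows_matching("Ke$ha")
result = [(value_at(r), pageviews[r]) for r in matched]
print(matched, result)
```

The only thing tying the columns together in this sketch is the shared row number; nothing is stored per value that points at the other columns.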

For range queries on numeric columns, we don’t currently have indexes to use, and so instead we scan the column row by row and read its values.
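
As a rough picture of that scan, again a toy Python sketch rather than Druid code: scan_greater_than is a hypothetical helper, and the column contents are invented. It walks the numeric column row by row, keeps the row numbers whose value passes the predicate, and then uses those row numbers to read the other columns.

```python
# Two columns of the same toy segment, aligned by row number.
# The values are invented for illustration.
artist = ["Justin Bieber", "Justin Bieber", "Ke$ha", "Ke$ha"]
pageviews = [1500, 200, 3000, 50]


def scan_greater_than(column, threshold):
    """No index is used: read every row's value and keep the matching row numbers."""
    return [row for row, value in enumerate(column) if value > threshold]


# Roughly: select * from wikipedia where pageviews > 1000
matched = scan_greater_than(pageviews, 1000)
rows = [(artist[r], pageviews[r]) for r in matched]
print(matched, rows)
```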

Hope this clarifies things for you.