Theta Sketch Calculation at Ingestion Time for Filtered Rows

Hi all,
We have a requirement in our organization to create a Theta sketch only on certain rows (i.e. rows that have a particular dimension value) at ingestion time.

We need this because we want a unique count across only the rows that have a particular dimension value, so the Theta sketch object should be built from just those rows.

Is that possible?

We are using Druid 0.13.

Hoping for a quick reply :)


Pravesh Gupta

Hey Pravesh,

You should be able to do it by wrapping the ingest-time aggregator in a “filtered” aggregator.
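
To make that concrete, here is a rough sketch of what the wrapped aggregator might look like in the `metricsSpec` of an ingestion spec. It is written as a small Python helper that prints the JSON, so the shape is easy to check. The column names `user_id` and `country`, the value `US`, and the metric name `unique_users_us` are made up for illustration, and the helper function itself is hypothetical. The `thetaSketch` aggregator comes from the `druid-datasketches` extension, so this assumes that extension is loaded.

```python
import json


def filtered_theta_sketch(name, field_name, filter_dimension, filter_value):
    """Build a 'filtered' aggregator that wraps a 'thetaSketch' aggregator.

    Only rows where filter_dimension == filter_value are fed into the sketch.
    All argument values used below are hypothetical examples.
    """
    return {
        "type": "filtered",
        # Selector filter: matches rows whose dimension equals the given value.
        "filter": {
            "type": "selector",
            "dimension": filter_dimension,
            "value": filter_value,
        },
        # The wrapped ingest-time aggregator; its name becomes the metric column.
        "aggregator": {
            "type": "thetaSketch",
            "name": name,
            "fieldName": field_name,
        },
    }


# Hypothetical metricsSpec: a row count plus a sketch of user_id for country = US.
metrics_spec = [
    {"type": "count", "name": "count"},
    filtered_theta_sketch("unique_users_us", "user_id", "country", "US"),
]

if __name__ == "__main__":
    # Print the JSON to paste into the "metricsSpec" section of an ingestion spec.
    print(json.dumps(metrics_spec, indent=2))
```

If you need sketches for several dimension values, you would add one filtered aggregator per value, each with its own metric name.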

Thanks for the reply, Gian.

I wanted to ask: if we use a filtered aggregator, is it going to have an impact on ingestion time? We are talking about around 180 million rows to be scanned for one segment (day-wise).

It would definitely change the amount of time it takes to process a row, but I’m not sure whether you would notice it.