How to group records into N(say 2) groups using a dimension with X different values?

Hi,

In my data, I have a dimension(dim) which can have only X different values(say A, B, C, …N).

Now, I want to count how many records have dim in the range(A-F) and how many records have dim in the range(G-N) ?

I am thinking of firing 2 queries to druid -

  1. Filter: (dim=A OR dim =B OR …dim =F) and count aggregator over records.

  2. Filter: (dim= G OR dim =H OR …dim =N) and count aggregator over records.

Is it the correct approach ?

Thanks.

Hey Abhiskeh,

You could use a lookup to map the values into the groups you’d like as part of the query

Here’a an example using the __time dimension to group on the part of day an event occurred.

{

“aggregations”: [

{

“type”: “longSum”,

“fieldName”: “stuff”,

“name”: “sum__of_stuff”

}

],

“intervals”: “2018-01-01T00:00:00+00:00/2018-02-01T00:00:00+00:00”,

“dataSource”: “myDataSource”,

“granularity”: “all”,

“postAggregations”: ,

“queryType”: “groupBy”,

“dimensions”: [

{

“type”: “extraction”,

“dimension”: “__time”,

“outputName”: “timeOfDay”,

“extractionFn”: {

“type”: “cascade”,

“extractionFns”: [

{

“type”: “timeFormat”,

“format”: “H”,

“locale”: “en”

},

{

“type”: “lookup”,

“lookup”: {

“type”: “map”,

“map”: {

“0”: “early_morning”,

“1”: “early_morning”,

“2”: “early_morning”,

“3”: “early_morning”,

“4”: “early_morning”,

“5”: “early_morning”,

“6”: “morning”,

“7”: “morning”,

“8”: “morning”,

“9”: “morning”,

“10”: “morning”,

“11”: “morning”,

“12”: “afternoon”,

“13”: “afternoon”,

“14”: “afternoon”,

“15”: “afternoon”,

“16”: “afternoon”,

“17”: “evening”,

“18”: “evening”,

“19”: “evening”,

“20”: “evening”,

“21”: “night”,

“22”: “night”,

“23”: “night”

}

},

“retainMissingValue”: true,

“injective”: true

}

]

}

}

]

}

``