Has anyone calculated the statistical mode on a Druid dataset?

Problem statement: The statistical mode is the value that occurs most frequently in a dataset.

E.g., given a dataset like the one below, calculate the most representative (most frequent) sale bucket for each product.

Dataset:

Product ID   Month      Sale Bucket
Pid1         January    10-20
Pid1         February   0-10
Pid1         March      10-20
Pid2         January    0-10
Pid2         February   10-20
Pid2         March      0-10

Expected output:

Product ID   Sale Bucket
Pid1         10-20
Pid2         0-10

This is what I have tried so far:

  • Did a simple Google search to find out whether there is something built in, or an extension, that does this. Couldn’t find one.
  • Tried my own naive approach to compute the mode over a dataset:
  1. Get the frequency of each data point (GROUP BY over Product ID and Sale Bucket)
  • Resulting in:
    Product ID   Sale Bucket   Frequency
    Pid1         10-20         2
    Pid1         0-10          1
    Pid2         10-20         1
    Pid2         0-10          2
  2. Get the data point with the maximum frequency per product (out of the frequencies calculated in step 1). I am stuck at this step 2, having tried 2 approaches:

Does anyone have any ideas on how to proceed further?
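
To make the intended result concrete, here is a minimal sketch of the two steps in plain Python (not Druid). It only illustrates the computation on the sample data above. The function name `mode_per_product` is hypothetical, and the tie-breaking rule (the bucket seen first wins when two buckets have the same frequency) is an assumption, since the question does not say how ties should be handled.

```python
from collections import Counter

# Sample rows from the dataset above: (product_id, month, sale_bucket)
rows = [
    ("Pid1", "January", "10-20"),
    ("Pid1", "February", "0-10"),
    ("Pid1", "March", "10-20"),
    ("Pid2", "January", "0-10"),
    ("Pid2", "February", "10-20"),
    ("Pid2", "March", "0-10"),
]


def mode_per_product(rows):
    """Return {product_id: most frequent sale bucket} (hypothetical helper)."""
    # Step 1: frequency of each (product, bucket) pair,
    # i.e. the GROUP BY over Product ID and Sale Bucket.
    freq = Counter((pid, bucket) for pid, _month, bucket in rows)

    # Step 2: per product, keep the bucket with the maximum frequency.
    # Assumption: on a tie, the bucket encountered first is kept.
    best = {}  # product_id -> (bucket, count)
    for (pid, bucket), count in freq.items():
        if pid not in best or count > best[pid][1]:
            best[pid] = (bucket, count)

    return {pid: bucket for pid, (bucket, _count) in best.items()}


result = mode_per_product(rows)
print(result)
```

On the sample data this gives 10-20 for Pid1 and 0-10 for Pid2, matching the expected output. Step 2 is the part that still needs an equivalent inside Druid.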