Druid - custom aggregation is slow

Hi everyone,

I just noticed that custom aggregation functions significantly decrease Druid's performance. I tried a very simple one:

"aggregations": [

{
    "type": "javascript",
    "name": "price_USD",
    "fieldNames": ["__time", "my_price"],
    "fnAggregate" : "function(current, timestamp, val) { factor = [1.0961, 1.0938, 1.0888]; return current + val * factor[0]; }",
    "fnCombine"   : "function(partialA, partialB) { return partialA + partialB; }",
    "fnReset"     : "function()                   { return 0.0; }"
}

]


and then I got the following timing results:

  • no aggregation: 3602 ms
  • using aggregation: 20628 ms

Am I doing something wrong, or are custom aggregation functions just expensive for Druid to compute?

It definitely takes more time to compute an aggregation than to not compute one. Also, I’m not sure whether that “factor = [1.0961, 1.0938, 1.0888]” means an array allocation (and later garbage collection) happens once for every row, but if so, that would add up. If you’re concerned about performance, the best way to do a custom aggregator is to write an extension rather than use JavaScript; it’s about 3–5x faster in my experience.
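
To make the allocation point concrete, here is a minimal sketch in plain JavaScript that runs outside Druid (for example in Node). It compares the aggregate function from the question with a variant that uses the literal 1.0961 directly, so no array is built per call. The names `runAggregator`, `sampleRows`, `aggregateWithArray`, and `aggregateWithLiteral` are hypothetical and only stand in for whatever loop the engine runs over rows; this is not Druid's actual code path, and the sample rows are made up for illustration.

```javascript
// Aggregate function as written in the question: the array literal is
// evaluated on every call, i.e. once per row.
function aggregateWithArray(current, timestamp, val) {
  var factor = [1.0961, 1.0938, 1.0888];
  return current + val * factor[0];
}

// Variant: same arithmetic, but the only factor actually used (factor[0])
// is written as a literal, so nothing is allocated per row.
function aggregateWithLiteral(current, timestamp, val) {
  return current + val * 1.0961;
}

// Same reset function as in the question.
function reset() {
  return 0.0;
}

// Hypothetical stand-in for the engine's per-row loop (an assumption,
// not Druid code): start from reset() and fold every row into the total.
function runAggregator(fnAggregate, rows) {
  var acc = reset();
  for (var i = 0; i < rows.length; i++) {
    acc = fnAggregate(acc, rows[i].__time, rows[i].my_price);
  }
  return acc;
}

// Made-up rows, for illustration only.
var sampleRows = [
  { __time: 0, my_price: 1.0 },
  { __time: 1, my_price: 2.0 }
];

// Both variants perform the same floating-point operations in the same
// order, so they produce the same total.
var totalWithArray = runAggregator(aggregateWithArray, sampleRows);
var totalWithLiteral = runAggregator(aggregateWithLiteral, sampleRows);
console.log(totalWithArray === totalWithLiteral);
```

In the query itself, the equivalent change would be to put the literal straight into the "fnAggregate" string: function(current, timestamp, val) { return current + val * 1.0961; }. Whether that changes the timings above would need to be measured; it only removes the per-row array that the reply was unsure about.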