Sorting by stringLast aggregator values in groupBy queries

Hi,

Has anyone else had issues sorting by stringLast aggregators in groupBy queries in 0.13.0?

My sample queries appear to return unsorted results when I sort by a stringLast aggregator, whereas numeric sorts on a longLast aggregator work as expected.

This query seems to return results that are not ordered by name.

It looks as though the results are being sorted by the dimension value (personId) rather than by the string metric.

{
  "queryType": "groupBy",
  "dataSource": { "type": "table", "name": "myTable" },
  "intervals": { "type": "intervals", "intervals": [ "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z" ] },
  "granularity": { "type": "all" },
  "dimensions": [ "personId" ],
  "aggregations": [
    { "type": "stringLast", "name": "name", "fieldName": "name" },
    { "type": "longLast", "name": "age", "fieldName": "age" }
  ],
  "limitSpec": {
    "type": "default",
    "limit": 1000,
    "columns": [ { "dimension": "name", "direction": "ascending", "dimensionOrder": "alphanumeric" } ]
  },
  "context": {}
}

In contrast, this query returns results ordered by age, as expected.

{
  "queryType": "groupBy",
  "dataSource": { "type": "table", "name": "myTable" },
  "intervals": { "type": "intervals", "intervals": [ "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z" ] },
  "granularity": { "type": "all" },
  "dimensions": [ "personId" ],
  "aggregations": [
    { "type": "stringLast", "name": "name", "fieldName": "name" },
    { "type": "longLast", "name": "age", "fieldName": "age" }
  ],
  "limitSpec": {
    "type": "default",
    "limit": 1000,
    "columns": [ { "dimension": "age", "direction": "ascending", "dimensionOrder": "alphanumeric" } ]
  },
  "context": {}
}

Does this look like a bug, or am I specifying something incorrectly in the query?

Cheers,

-Ben

Note: I have been able to work around this by referencing the aggregation directly in an expression postAggregator and then sorting by that instead (the inspiration came from the test case on GitHub).

The following query sorts as expected:

{
  "queryType": "groupBy",
  "dataSource": { "type": "table", "name": "myTable" },
  "intervals": { "type": "intervals", "intervals": [ "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z" ] },
  "granularity": { "type": "all" },
  "dimensions": [ "personId" ],
  "aggregations": [
    { "type": "stringLast", "name": "name", "fieldName": "name" },
    { "type": "longLast", "name": "age", "fieldName": "age" }
  ],
  "postAggregations": [
    { "type": "expression", "name": "sortColumn", "expression": "name" }
  ],
  "limitSpec": {
    "type": "default",
    "limit": 1000,
    "columns": [ { "dimension": "sortColumn", "direction": "ascending", "dimensionOrder": "alphanumeric" } ]
  },
  "context": {}
}
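
In case it helps anyone who builds these queries from code, here is a minimal Python sketch of the same rewrite. The helper name sort_by_aggregator_via_post_agg and the default output name sortColumn are placeholders of mine, not anything from Druid. The sketch only edits the query JSON as a dict and does not send anything to a broker. It also assumes the aggregator name is a plain identifier that can be used as-is in an expression.

```python
import copy
import json


def sort_by_aggregator_via_post_agg(query, aggregator_name, sort_column="sortColumn"):
    """Return a copy of a groupBy query that sorts on an expression
    postAggregator mirroring `aggregator_name` rather than on the
    aggregator itself. Hypothetical helper, not a Druid API."""
    rewritten = copy.deepcopy(query)

    # Expose the aggregator's value under a new output name.
    # Assumption: aggregator_name is a plain identifier valid in an expression.
    post_aggs = rewritten.setdefault("postAggregations", [])
    post_aggs.append(
        {"type": "expression", "name": sort_column, "expression": aggregator_name}
    )

    # Point every limitSpec column that referenced the aggregator at the new name.
    for column in rewritten.get("limitSpec", {}).get("columns", []):
        if column.get("dimension") == aggregator_name:
            column["dimension"] = sort_column

    return rewritten


original = {
    "queryType": "groupBy",
    "dataSource": {"type": "table", "name": "myTable"},
    "intervals": {
        "type": "intervals",
        "intervals": ["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"],
    },
    "granularity": {"type": "all"},
    "dimensions": ["personId"],
    "aggregations": [
        {"type": "stringLast", "name": "name", "fieldName": "name"},
        {"type": "longLast", "name": "age", "fieldName": "age"},
    ],
    "limitSpec": {
        "type": "default",
        "limit": 1000,
        "columns": [
            {"dimension": "name", "direction": "ascending", "dimensionOrder": "alphanumeric"}
        ],
    },
    "context": {},
}

workaround = sort_by_aggregator_via_post_agg(original, "name")
print(json.dumps(workaround, indent=2))
```

The original dict is left untouched, so the same base query can still be reused for the longLast sort, which did not need the workaround.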

Cheers,

-Ben

Hi Ben,

I’m not sure what’s happening, but this sounds like a bug. Could you please file an issue for it?

Thanks,

Jon

Filed an issue at https://github.com/apache/incubator-druid/issues/7691