Group By Timeout

Hi everyone, we recently set up Druid 0.18 and ingested a dataset from Kafka in order to evaluate it.
We are experiencing the following issue when our query contains a GROUP BY and an aggregation function (count(*)).

Did anyone else experience the same issue?

By the way, the same query without the count(*) runs fine.

Thanks

Error Message: Query timeout / Timeout waiting for task. / java.util.concurrent.TimeoutException / on host druid-my-druid-historicals-0.*

It means the query is taking longer than your configured timeout allows. You can run EXPLAIN PLAN FOR on both versions of the query and compare the query plans to make sure Druid is doing what you want. Send both plans over and we can help you figure it out.
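
In case it helps, here is a rough sketch of how the two EXPLAIN requests could be built. The SQL statements are assumptions reconstructed from this thread (the original SQL was not posted), and `explain_payload` is just a hypothetical helper name. The bodies are meant to be POSTed to the Druid SQL endpoint (`/druid/v2/sql`) on your router or broker.

```python
import json


def explain_payload(sql: str) -> dict:
    """Wrap a SQL statement in EXPLAIN PLAN FOR, shaped as a Druid SQL request body."""
    return {"query": "EXPLAIN PLAN FOR " + sql.strip().rstrip(";")}


# Assumed SQL: reconstructed from the thread, not the poster's actual statements.
WITH_COUNT = 'SELECT pbx_status, COUNT(*) FROM "compliance-reports" GROUP BY pbx_status'
WITHOUT_COUNT = 'SELECT pbx_status FROM "compliance-reports" GROUP BY pbx_status'

for sql in (WITH_COUNT, WITHOUT_COUNT):
    # Each printed line is a JSON body to POST with Content-Type: application/json.
    print(json.dumps(explain_payload(sql)))
```

Comparing the two returned plans side by side should show exactly where they differ.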

The relevant setting is druid.server.http.defaultQueryTimeout.
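
That property is the server-side default, in milliseconds. A single query can also carry its own `timeout` value in the query context. Below is a minimal sketch of adding it to a request body; `with_timeout` is a hypothetical helper and the 60000 ms value is only an illustration, not a recommendation.

```python
def with_timeout(body: dict, timeout_ms: int) -> dict:
    """Return a copy of a Druid query body with context.timeout set (milliseconds)."""
    context = dict(body.get("context", {}))  # copy so the caller's dict is not mutated
    context["timeout"] = timeout_ms
    return {**body, "context": context}


# Illustrative SQL request body; the statement itself is an assumption.
request = with_timeout(
    {"query": 'SELECT pbx_status, COUNT(*) FROM "compliance-reports" GROUP BY pbx_status'},
    60000,
)
```

This only shows where the setting lives. If a small query still times out, a longer timeout is probably not the underlying fix.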

Hi Rachel, here are the query plans for both queries:

Datasource Size:
~20MB
~41k rows

Query1:
Group By with count() - Timeout
```{
"queryType": "topN",
"dataSource": {
"type": "table",
"name": "compliance-reports"
},
"virtualColumns": [],
"dimension": {
"type": "default",
"dimension": "pbx_status",
"outputName": "d0",
"outputType": "STRING"
},
"metric": {
"type": "dimension",
"previousStop": null,
"ordering": {
"type": "lexicographic"
}
},
"threshold": 100,
"intervals": {
"type": "intervals",
"intervals": [
"-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
]
},
"filter": null,
"granularity": {
"type": "all"
},
"aggregations": [
{
"type": "count",
"name": "a0"
}
],
"postAggregations": [],
"context": {
"sqlOuterLimit": 100,
"sqlQueryId": "21f58119-c113-476e-a739-abaea7b2b419"
},
"descending": false
}```

Query2:
Group By without count() - Query Time: 0.4s
```{
"queryType": "topN",
"dataSource": {
"type": "table",
"name": "compliance-reports"
},
"virtualColumns": [],
"dimension": {
"type": "default",
"dimension": "pbx_status",
"outputName": "d0",
"outputType": "STRING"
},
"metric": {
"type": "dimension",
"previousStop": null,
"ordering": {
"type": "lexicographic"
}
},
"threshold": 100,
"intervals": {
"type": "intervals",
"intervals": [
"-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
]
},
"filter": null,
"granularity": {
"type": "all"
},
"aggregations": [],
"postAggregations": [],
"context": {
"sqlOuterLimit": 100,
"sqlQueryId": "e5a52624-e2e1-4337-a5af-216e09feb934"
},
"descending": false
}```

When the query is executed using a specific time period, do you still get a timeout?
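
Both plans above use the widest possible interval because the SQL has no filter on `__time`. A possible way to build a time-bounded version of the query is sketched here. The SQL is an assumption reconstructed from the plans, `bounded_sql` is a hypothetical helper, and the dates are placeholders.

```python
from datetime import datetime


def bounded_sql(start: datetime, end: datetime) -> str:
    """Build the GROUP BY query restricted to the half-open interval [start, end)."""
    fmt = "%Y-%m-%d %H:%M:%S"
    return (
        'SELECT pbx_status, COUNT(*) AS cnt FROM "compliance-reports" '
        f"WHERE __time >= TIMESTAMP '{start.strftime(fmt)}' "
        f"AND __time < TIMESTAMP '{end.strftime(fmt)}' "
        "GROUP BY pbx_status"
    )


# Placeholder dates: substitute a range that your ingested data actually covers.
print(bounded_sql(datetime(2020, 5, 1), datetime(2020, 5, 2)))
```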

Feel free to post your SQL here :)

Also, how much data are you querying? How many segments have been created? (One segment should hold about 5 million rows.)
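
One way to check the segment count is to query the `sys.segments` system table through Druid SQL. This is a small sketch that builds such a statement; `segment_stats_sql` is a hypothetical helper, and the datasource name is taken from the plans posted above.

```python
def segment_stats_sql(datasource: str) -> str:
    """Build a Druid SQL statement summarizing segments for one datasource."""
    escaped = datasource.replace("'", "''")  # escape single quotes in the literal
    return (
        'SELECT COUNT(*) AS segments, SUM("num_rows") AS total_rows, '
        'SUM("size") AS total_bytes '
        "FROM sys.segments "
        f"WHERE datasource = '{escaped}'"
    )


print(segment_stats_sql("compliance-reports"))
```

A large number of very small segments for a datasource of this size would be worth mentioning in your reply.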