Maximum number of rows [500000] reached

Hi Druid Gurus,

We are using Druid in combination with Plyql+Tableau. In the Tableau dashboard, we populate all accounts (the customer’s business name) and have enabled a search button so that end users can search and select an account.

This strategy was working well but is now failing, because one of our data sets has close to 2.1 million accounts. Whenever Plyql+Tableau triggers a query to get the list of accounts, the query fails because Druid’s query result limit is set to 500,000. We already have the config below in our broker’s runtime.properties:

druid.query.groupBy.maxIntermediateRows=20000000

druid.query.groupBy.maxResults=20000000

But the Druid query below still fails with the error shown after it. How do we increase the result row limit? Is there any other place where we need to specify this setting?

{
  "queryType": "groupBy",
  "dataSource": "ai_credit_by_account",
  "granularity": "all",
  "dimensions": [
    "account_id"
  ],
  "aggregations": [
    {
      "type": "doubleSum",
      "name": "usd_amt",
      "fieldName": "usd_amt"
    }
  ],
  "intervals": [
    "2016-11-01/2017-12-02"
  ]
}

[pp_paz_pci_admin@ccg01druid02 queries]$ curl -X 'POST' -H 'Content-Type:application/json' -d @ai_credit_by_account.json ccg01druid02.ccg01.phx.inc.com:8082/druid/v2/?pretty

{
  "error" : "Resource limit exceeded",
  "errorMessage" : "Maximum number of rows [500000] reached",
  "errorClass" : "io.druid.query.ResourceLimitExceededException",
  "host" : null
}

Hey Kasi,

That’s a groupBy v1 error. If you upgrade to Druid 0.10.0+, you’ll get groupBy v2 by default, which doesn’t have a row-based limit. It does have byte-based limits, which you can read about at http://druid.io/docs/latest/querying/groupbyquery.html. Note also that the error messages for exceeding those limits will become clearer in Druid 0.10.1, which has a release candidate out now.
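If upgrading right away is not an option, one thing that may be worth trying is requesting the v2 strategy for a single query through the query context. The sketch below is only an illustration: it rebuilds the query from this thread and adds a `groupByStrategy` context entry, which only has an effect on Druid versions that already include groupBy v2, so please check the groupBy documentation for your version. `BROKER_URL`, `build_query` and `post_query` are hypothetical names chosen for this example, and the broker address is a placeholder.

```python
import json
import urllib.request

# Hypothetical broker address; replace with your own broker host and port.
BROKER_URL = "http://localhost:8082/druid/v2/?pretty"


def build_query(strategy="v2"):
    """Return the groupBy query from this thread with a per-query strategy override."""
    return {
        "queryType": "groupBy",
        "dataSource": "ai_credit_by_account",
        "granularity": "all",
        "dimensions": ["account_id"],
        "aggregations": [
            {"type": "doubleSum", "name": "usd_amt", "fieldName": "usd_amt"}
        ],
        "intervals": ["2016-11-01/2017-12-02"],
        # Assumption: the cluster runs a Druid version that ships groupBy v2;
        # on such versions this context key selects the strategy per query.
        "context": {"groupByStrategy": strategy},
    }


def post_query(query, url=BROKER_URL):
    """POST the query to the broker and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `post_query(build_query())` against a reachable broker should return the result rows as a list. Note that `urlopen` raises `urllib.error.HTTPError` when the broker answers with an error status, so a "Resource limit exceeded" response would surface as an exception rather than as a returned JSON object.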