I have a use case where i need to pass multiple values to the IN operator in DRUID. The values may range from hundreds to thousands. Read the documentation but did not find anywhere how many values the IN operator accepts.
Is there any documentation where i can find how many values DRUID accepts in IN operator.
select * from ‘datasourcename’ where id in (1,2,3,4,…20000);
In the example i have given 1,2,3 continuous numbers but in reality it will be replaced with the ids that i get from different database where the ids won’t be sequential.
As most of the data bases have limit on the IN operator wanted to know if that is the same case with DRUID. Did not try passing those many ids in DRUID. As a alternative used sub queries to accomplish that task by having the required table in DRUID.
Will try passing the ids to IN operator directly and test it.
There is no specific limit as far as I know. From a practical side if you are doing 20k+ union operations on the underlying bitmaps, you might see performance impacts depending on if you are using Roaring or Concise bitmap compression. Usually Roaring is faster but sometimes larger in size (usually worth it though).
There isn’t a specific limit to the number of operators in the IN clause. Be aware though that once you get into the thousand-tens of thousands range, planning the SQL query can take minutes: https://github.com/apache/incubator-druid/issues/7904. If you’re issuing these queries programmatically, we found it worth while to bite the bullet and move to native queries to avoid this.