Select/Scan in Druid


I need the Druid equivalent of this SQL:

SELECT DISTINCT countryName FROM wikipedia

How do I write that in Druid?

I looked into the Select query, but I am not sure what to put in pagingSpec and threshold. I need all the distinct records.

I also looked into the Scan query, but I am not sure how to get distinct records with it. Also, on a large dataset (thousands of records), it gets stuck.



Also, is Scan meant for streaming results? If so, how do I use it for batch?

You can use SELECT DISTINCT in Druid SQL:

select distinct "country_name"
from wikipedia2

Or a Druid native query:


{
  "queryType": "topN",
  "dataSource": {
    "type": "table",
    "name": "wikipedia2"
  },
  "virtualColumns": [],
  "dimension": {
    "type": "default",
    "dimension": "country_name",
    "outputName": "d0",
    "outputType": "STRING"
  },
  "metric": {
    "type": "dimension",
    "previousStop": null,
    "ordering": {
      "type": "lexicographic"
    }
  },
  "threshold": 5000,
  "intervals": {
    "type": "intervals",
    "intervals": []
  },
  "filter": null,
  "granularity": {
    "type": "all"
  },
  "aggregations": [],
  "postAggregations": [],
  "descending": false
}
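If it helps, here is a minimal Python sketch that builds the same kind of topN body, so the datasource, dimension, and threshold can be changed in one place. The helper name `build_distinct_topn` is made up for illustration, and the interval string is a hypothetical placeholder, because the interval list in the query above did not survive in the post. The `previous_stop` argument maps to the `previousStop` field of the dimension metric spec, which should let you continue from the last value returned when there are more distinct values than `threshold`.

```python
import json


def build_distinct_topn(datasource, dimension, interval,
                        threshold=5000, previous_stop=None):
    """Build a native topN body that lists values of one dimension in
    lexicographic order (hypothetical helper, not part of Druid)."""
    return {
        "queryType": "topN",
        "dataSource": {"type": "table", "name": datasource},
        "dimension": {
            "type": "default",
            "dimension": dimension,
            "outputName": "d0",
            "outputType": "STRING",
        },
        "metric": {
            "type": "dimension",
            # Values sorting before previousStop are skipped, which is
            # what makes paging through the dimension possible.
            "previousStop": previous_stop,
            "ordering": {"type": "lexicographic"},
        },
        # topN returns at most this many values per query.
        "threshold": threshold,
        "intervals": {"type": "intervals", "intervals": [interval]},
        "granularity": {"type": "all"},
        "aggregations": [],
        "postAggregations": [],
    }


# The interval below is an assumption; use the time range of your own data.
query = build_distinct_topn("wikipedia2", "country_name",
                            "2015-09-12/2015-09-13")
print(json.dumps(query, indent=2))
```

The printed JSON can be saved to a file and posted to the native query endpoint in the usual way.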


To send the SQL query over HTTP, wrap it in a JSON body (saved as query.sql for the command below):

{
  "query": "SELECT DISTINCT country_name FROM wikipedia2"
}


$ curl -XPOST -H'Content-Type: application/json' http://localhost:8082/druid/v2/sql/ -d @query.sql
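The same request can also be made from a script. The sketch below uses only the Python standard library and posts to the URL from the curl command. The function names are hypothetical, and `distinct_values` assumes the endpoint returns its default result format, a JSON array of objects keyed by column name.

```python
import json
import urllib.request

# Same endpoint as in the curl command above.
SQL_URL = "http://localhost:8082/druid/v2/sql/"


def build_sql_request(sql, url=SQL_URL):
    """Wrap a SQL string in the JSON body the SQL endpoint expects."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def distinct_values(sql, column, url=SQL_URL):
    """Run the query and return one column as a list.

    Assumes each row in the response is an object keyed by column name.
    """
    with urllib.request.urlopen(build_sql_request(sql, url)) as resp:
        rows = json.load(resp)
    return [row[column] for row in rows]


request = build_sql_request("SELECT DISTINCT country_name FROM wikipedia2")
# With a reachable Druid instance:
# countries = distinct_values(
#     "SELECT DISTINCT country_name FROM wikipedia2", "country_name")
```

Building the body with `json.dumps` avoids quoting mistakes when the SQL itself contains double-quoted identifiers.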


