Parallel tasks not working in 0.14?

I am running an “index_parallel” type task and it doesn't seem to use all 15 of my workers. I submitted the spec through the new console (tasks page). I wanted to check whether I am missing something or whether this is a bug.

Sample payload:

{
  "type": "index_parallel",
  "spec": {
    "dataSchema": {
      "dataSource": "ffsdataset",
      "parser": {
        "type": "string",
        "parseSpec": {
          "type": "jsonLowercase",
          "format": "json",
          "dimensionsSpec": {
            "dimensions": [
              "id",
              "name"
            ]
          },
          "timestampSpec": {
            "column": "date",
            "format": "iso"
          }
        }
      },
      "metricsSpec": [
        {
          "type": "count",
          "name": "count"
        },
        {
          "type": "doubleSum",
          "name": "metric",
          "fieldName": "metric"
        }
      ],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "day",
        "queryGranularity": "day",
        "intervals": ["2001-02-01/2019-03-31"],
        "rollup": true
      }
    },
    "ioConfig": {
      "type": "index_parallel",
      "firehose": {
        "type": "static-s3",
        "uris": ["s3://file1", "s3://file2", "s3://file3"]
      },
      "appendToExisting": true
    },
    "tuningConfig": {
      "type": "index_parallel",
      "targetPartitionSize": 500000000,
      "maxRowsInMemory": 500000,
      "forceExtendableShardSpecs": true,
      "forceGuaranteedRollup": true,
      "partitionDimensions": ["id"]
    }
  }
}

Hey Karthik,

There’s a new “maxNumSubTasks” property that you should set to the number of sub-tasks you want to use. Its default is 1, meaning no subtasks.

Oh, this is convenient as well. On a similar note, is there an API I can use to determine the total number of workers across all the MiddleManagers?

Try sending a GET request to http://overlord-host:port/druid/indexer/v1/workers
It will return the list of available workers, a.k.a. MiddleManagers.
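
For an ETL service, a rough Python sketch of calling that endpoint and adding up the task slots might look like the following. The helper names are made up for illustration, and the response field names (`worker.capacity`, `currCapacityUsed`) are an assumption about what the Overlord returns, so check them against the output of your own cluster first.

```python
import json
from urllib.request import urlopen


def fetch_workers(overlord_url):
    """GET /druid/indexer/v1/workers from the Overlord and parse the JSON list."""
    with urlopen(f"{overlord_url}/druid/indexer/v1/workers") as resp:
        return json.load(resp)


def total_capacity(workers):
    """Return (total task slots, slots in use) summed over all workers.

    Assumes each entry looks like
    {"worker": {"capacity": N, ...}, "currCapacityUsed": M, ...}.
    """
    total = sum(w["worker"]["capacity"] for w in workers)
    used = sum(w.get("currCapacityUsed", 0) for w in workers)
    return total, used
```

Something like `total, used = total_capacity(fetch_workers("http://overlord-host:port"))` would then give the overall slot count, with the host and port replaced by those of your Overlord.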

Thanks @nishant, I can work with that. I am writing a small ETL service which can definitely use this information.

Hi Gian, Karthik,

In which runtime configuration should maxNumSubTasks be set?

Thanks,
Esthove

It’s part of the tuning config: http://druid.io/docs/latest/ingestion/native_tasks.html
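
As a minimal sketch of that change, assuming the spec is built in Python before being submitted, the property goes next to the other `tuningConfig` fields. The helper name `with_max_num_sub_tasks` is hypothetical, and the spec here is trimmed down for illustration; the value 15 just mirrors the worker count mentioned at the top of the thread.

```python
import copy
import json


def with_max_num_sub_tasks(task_spec, max_num_sub_tasks):
    """Return a copy of an index_parallel spec with maxNumSubTasks set."""
    spec = copy.deepcopy(task_spec)
    # Create tuningConfig if the spec does not have one yet.
    tuning = spec["spec"].setdefault("tuningConfig", {"type": "index_parallel"})
    tuning["maxNumSubTasks"] = max_num_sub_tasks
    return spec


# Trimmed-down spec for illustration; dataSchema and ioConfig are omitted.
task_spec = {
    "type": "index_parallel",
    "spec": {"tuningConfig": {"type": "index_parallel", "maxRowsInMemory": 500000}},
}
updated = with_max_num_sub_tasks(task_spec, 15)
print(json.dumps(updated["spec"]["tuningConfig"], indent=2))
```

The original spec is left untouched, so the same base payload can be reused with different sub-task counts.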

Thank you very much, got the index_parallel task working with this config update.

-Esthove