Hadoop ingestion task never completes with the numShards field

I am experimenting with the Druid tutorial for loading data from Hadoop by adding the numShards field to the partitionsSpec in order to generate 2 partitions. When I run this job, the indexing task keeps running and never finishes, and there are no errors in the logs. Please offer any advice on how to fix this.

The only thing I modified in the ingestion spec file (wikipedia-index-hadoop.json) found in quickstart/tutorial is the numShards in the partitionsSpec:

```json
"tuningConfig" : {
  "type" : "hadoop",
  "partitionsSpec" : {
    "type" : "hashed",
    "numShards" : 2
  }
}
```
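For completeness, this is roughly how that tuningConfig sits in the spec after my edit. The jobProperties below are my approximation of what the stock tutorial file ships for the druid-hadoop-demo container, so the exact hostnames and values may differ slightly from what I am actually running:

```json
"tuningConfig" : {
  "type" : "hadoop",
  "partitionsSpec" : {
    "type" : "hashed",
    "numShards" : 2
  },
  "jobProperties" : {
    "fs.defaultFS" : "hdfs://druid-hadoop-demo:9000",
    "yarn.resourcemanager.hostname" : "druid-hadoop-demo",
    "mapreduce.job.user.classpath.first" : "true",
    "mapreduce.map.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
    "mapreduce.reduce.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8"
  }
}
```

Everything else in the spec (dataSchema, ioConfig, granularitySpec) is untouched from the tutorial.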

I keep getting this output, and the indexing never finishes:

```
Task index_hadoop_wikipedia_2020-03-12T18:15:31.155Z still running…
Task index_hadoop_wikipedia_2020-03-12T18:15:31.155Z still running…
Task index_hadoop_wikipedia_2020-03-12T18:15:31.155Z still running…
Task index_hadoop_wikipedia_2020-03-12T18:15:31.155Z still running…
Task index_hadoop_wikipedia_2020-03-12T18:15:31.155Z still running…
Task index_hadoop_wikipedia_2020-03-12T18:15:31.155Z still running…
Task index_hadoop_wikipedia_2020-03-12T18:15:31.155Z still running…
Task index_hadoop_wikipedia_2020-03-12T18:15:31.155Z still running…
Task index_hadoop_wikipedia_2020-03-12T18:15:31.155Z still running…
```

Here are the logs from the Hadoop task:

```
2020-03-12T16:49:58,035 INFO [task-runner-0-priority-0] org.apache.druid.indexer.path.StaticPathSpec - Adding paths[/quickstart/wikiticker-2015-09-12-sampled.json.gz]
2020-03-12T16:49:58,037 INFO [task-runner-0-priority-0] org.apache.druid.indexer.HadoopDruidIndexerJob - No metadataStorageUpdaterJob set in the config. This is cool if you are running a hadoop index task, otherwise nothing will be uploaded to database.
2020-03-12T16:49:58,047 INFO [task-runner-0-priority-0] org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2020-03-12T16:49:58,051 INFO [task-runner-0-priority-0] org.apache.druid.indexer.path.StaticPathSpec - Adding paths[/quickstart/wikiticker-2015-09-12-sampled.json.gz]
2020-03-12T16:49:58,878 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at druid-hadoop-demo/127.0.0.1:8032
2020-03-12T16:49:59,024 WARN [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobResourceUploader - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2020-03-12T16:49:59,035 WARN [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2020-03-12T16:49:59,344 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input files to process : 1
2020-03-12T16:49:59,491 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2020-03-12T16:49:59,500 INFO [task-runner-0-priority-0] org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2020-03-12T16:49:59,561 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1584029074330_0005
2020-03-12T16:49:59,650 INFO [task-runner-0-priority-0] org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2020-03-12T16:50:00,975 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1584029074330_0005
2020-03-12T16:50:01,001 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - The url to track the job: http://druid-hadoop-demo:8088/proxy/application_1584029074330_0005/
2020-03-12T16:50:01,001 INFO [task-runner-0-priority-0] org.apache.druid.indexer.IndexGeneratorJob - Job wikipedia-index-generator-Optional.of([2015-09-12T00:00:00.000Z/2015-09-13T00:00:00.000Z]) submitted, status available at http://druid-hadoop-demo:8088/proxy/application_1584029074330_0005/
2020-03-12T16:50:01,002 INFO [task-runner-0-priority-0] org.apache.druid.indexer.JobHelper - MR job id [job_1584029074330_0005] is written to the file [/Users/asandhu/Desktop/Flurry_Details/Druid/setup/druid-test/apache-druid-0.15.1-incubating/var/druid/task/index_hadoop_wikipedia_2020-03-12T16:49:51.542Z/mapReduceJobId.json]
2020-03-12T16:50:01,002 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Running job: job_1584029074330_0005
```