Combine old data with new data and overwrite option not working in Druid

The submitted task is failing without any error.

My requirement is to

update the data if it already exists, and insert it if it does not exist.

Three types of tasks are available:

1. Overwrite the initial data

2. Combine old data with new data and overwrite

3. Append to the data

The 1st and 3rd options are working.

When I submit a task with the 2nd option, the task fails.

Below are the logs:

uid.metrics.emitter.dimension.taskId=index_testtutorial_2019-05-30T13:43:05.513Z -Ddruid.host=ip-172-16-243-48.eu-west-1.compute.internal -Ddruid.port=8105 -Ddruid.tlsPort=-1 io.druid.cli.Main internal peon /apps/datafiles_2/druid-storage/middlemanagertasks/prdrduk03a/index_testtutorial_2019-05-30T13:43:05.513Z/task.json /apps/datafiles_2/druid-storage/middlemanagertasks/prdrduk03a index_testtutorial_2019-05-30T13:43:05.513Z/4fd45592-b710-4d23-a2ba-1b17f67d1bf7/status.json
MiddleManager_prdrduk03a-druid.log:2019-May-30 13:43:05 PM [forking-task-runner-11] INFO io.druid.indexing.overlord.TaskRunnerUtils - Task [index_testtutorial_2019-05-30T13:43:05.513Z] location changed to [TaskLocation{host='ip-172-16-243-48.eu-west-1.compute.internal', port=8105, tlsPort=-1}].
MiddleManager_prdrduk03a-druid.log:2019-May-30 13:43:05 PM [forking-task-runner-11] INFO io.druid.indexing.overlord.TaskRunnerUtils - Task [index_testtutorial_2019-05-30T13:43:05.513Z] status changed to [RUNNING].
MiddleManager_prdrduk03a-druid.log:2019-May-30 13:43:05 PM [forking-task-runner-11] INFO io.druid.indexing.overlord.ForkingTaskRunner - Logging task index_testtutorial_2019-05-30T13:43:05.513Z output to: /apps/datafiles_2/druid-storage/middlemanagertasks/prdrduk03a/index_testtutorial_2019-05-30T13:43:05.513Z/log
MiddleManager_prdrduk03a-druid.log:2019-May-30 13:43:05 PM [WorkerTaskMonitor] INFO io.druid.indexing.worker.WorkerTaskMonitor - Updating task [index_testtutorial_2019-05-30T13:43:05.513Z] announcement with location [TaskLocation{host='ip-172-16-243-48.eu-west-1.compute.internal', port=8105, tlsPort=-1}]
MiddleManager_prdrduk03a-druid.log:2019-May-30 13:43:11 PM [forking-task-runner-11-[index_testtutorial_2019-05-30T13:43:05.513Z]] INFO io.druid.indexing.overlord.ForkingTaskRunner - Process exited with status[0] for task: index_testtutorial_2019-05-30T13:43:05.513Z
MiddleManager_prdrduk03a-druid.log:2019-May-30 13:43:11 PM [forking-task-runner-11] INFO io.druid.indexing.common.tasklogs.FileTaskLogs - Wrote task log to: /apps/datafiles_2/druid-storage/druid-segments/indexing-logs/index_testtutorial_2019-05-30T13:43:05.513Z.log
MiddleManager_prdrduk03a-druid.log:2019-May-30 13:43:11 PM [forking-task-runner-11] INFO io.druid.indexing.overlord.TaskRunnerUtils - Task [index_testtutorial_2019-05-30T13:43:05.513Z] status changed to [FAILED].
MiddleManager_prdrduk03a-druid.log:2019-May-30 13:43:11 PM [forking-task-runner-11] INFO io.druid.indexing.overlord.ForkingTaskRunner - Removing task directory: /apps/datafiles_2/druid-storage/middlemanagertasks/prdrduk03a/index_testtutorial_2019-05-30T13:43:05.513Z
MiddleManager_prdrduk03a-druid.log:2019-May-30 13:43:11 PM [WorkerTaskMonitor] INFO io.druid.indexing.worker.WorkerTaskMonitor - Job's finished. Completed [index_testtutorial_2019-05-30T13:43:05.513Z] with status [FAILED]

Please suggest how to make this option work.

Thanks in advance.

Please reply if anyone has used the combining firehose.

Did you set appendToExisting to false? That should do it. Please share your ingestion spec.

Yes, I am using appendToExisting: false.

Below is my index task JSON:

{
  "context": {
    "taskLockTimeout": "10200000",
    "priority": 60
  },
  "type": "index",
  "spec": {
    "dataSchema": {
      "metricsSpec": [
        {
          "fieldName": "datacount",
          "name": "count",
          "type": "longSum"
        }
      ],
      "parser": {
        "parseSpec": {
          "dimensionsSpec": {
            "dimensions": [
              "column1",
              "column2"
            ]
          },
          "columns": [
            "column1", "column2", "column3", "createddate"
          ],
          "format": "csv",
          "timestampSpec": {
            "format": "yyyy-MM-dd",
            "column": "createddate"
          }
        },
        "type": "string"
      },
      "granularitySpec": {
        "intervals": [
          "2019-04-15/2019-06-15"
        ],
        "segmentGranularity": "day",
        "queryGranularity": "day",
        "type": "uniform",
        "rollup": true
      },
      "dataSource": "test_store2"
    },
    "ioConfig": {
      "type": "index",
      "firehose": {
        "type": "combining",
        "delegates": [
          {
            "type": "ingestSegment",
            "dataSource": "test_store2",
            "interval": "2019-04-15/2019-06-15"
          },
          {
            "type": "local",
            "baseDir": "/apps/",
            "filter": "test.txt"
          }
        ]
      },
      "appendToExisting": false
    },
    "tuningConfig": {
      "targetPartitionSize": 5000000,
      "type": "index",
      "maxRowsInMemory": 39999,
      "forceExtendableShardSpecs": true
    }
  }
}

Below is the exception identified when we submit the task:

java.lang.NullPointerException: taskToolbox is not set
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:229) ~[guava-16.0.1.jar:?]
at io.druid.indexing.firehose.IngestSegmentFirehoseFactory.connect(IngestSegmentFirehoseFactory.java:133) ~[druid-indexing-service-0.12.3.jar:0.12.3]
at io.druid.segment.realtime.firehose.CombiningFirehoseFactory$CombiningFirehose.nextFirehose(CombiningFirehoseFactory.java:91) ~[druid-server-0.12.3.jar:0.12.3]
at io.druid.segment.realtime.firehose.CombiningFirehoseFactory$CombiningFirehose.<init>(CombiningFirehoseFactory.java:80) ~[druid-server-0.12.3.jar:0.12.3]
at io.druid.segment.realtime.firehose.CombiningFirehoseFactory.connect(CombiningFirehoseFactory.java:59) ~[druid-server-0.12.3.jar:0.12.3]
at io.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:655) ~[druid-indexing-service-0.12.3.jar:0.12.3]
at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:264) ~[druid-indexing-service-0.12.3.jar:0.12.3]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:444) [druid-indexing-service-0.12.3.jar:0.12.3]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:416) [druid-indexing-service-0.12.3.jar:0.12.3]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
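From the trace, the NullPointerException comes from a Guava Preconditions check inside IngestSegmentFirehoseFactory.connect, which expects a task toolbox to have been injected before the firehose connects; when the factory is used as a delegate of a combining firehose, that toolbox is apparently never set. A simplified illustration of that kind of guard (this is only a sketch based on the stack trace, not the actual Druid 0.12.3 source; the class and field names here are made up for illustration):

import com.google.common.base.Preconditions;

// Sketch of the guard that produces the "taskToolbox is not set" message.
// In the real indexing service, the task injects its TaskToolbox into the
// ingestSegment firehose factory before connect() is called.
public class IngestSegmentFirehoseFactorySketch
{
  private Object taskToolbox; // stays null if nothing injects it

  public void setTaskToolbox(Object taskToolbox)
  {
    this.taskToolbox = taskToolbox;
  }

  public void connect()
  {
    // Throws NullPointerException("taskToolbox is not set") when the
    // toolbox was never injected, which matches the failure in the task log.
    Preconditions.checkNotNull(taskToolbox, "taskToolbox is not set");
    // ... normal connect logic would follow here
  }
}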