EventReceiverFirehose setup

When I submitted a task using the EventReceiverFirehose, I expected a peon to be created and ready to accept events, but in the overlord console the task is in the FAILED status. I am also having difficulty retrieving the logs for the failed task.

Perhaps I did not construct the task JSON correctly when submitting the job to the overlord node (I could not find a good example JSON for this case). Here is the JSON, also attached. When I used a modified version of it to ingest a local file, the console reported success. Thanks.

{
  "type": "index_realtime",
  "spec": {
    "dataSchema": {
      "dataSource": "wikipedia",
      "parser": {
        "type": "string",
        "parseSpec": {
          "format": "json",
          "timestampSpec": {
            "column": "timestamp",
            "format": "auto"
          },
          "dimensionsSpec": {
            "dimensions": [
              "page",
              "language",
              "user",
              "unpatrolled",
              "newPage",
              "robot",
              "anonymous",
              "namespace",
              "continent",
              "country",
              "region",
              "city"
            ],
            "dimensionExclusions": [],
            "spatialDimensions": []
          }
        }
      },
      "metricsSpec": [
        {
          "type": "count",
          "name": "count"
        },
        {
          "type": "doubleSum",
          "name": "added",
          "fieldName": "added"
        },
        {
          "type": "doubleSum",
          "name": "deleted",
          "fieldName": "deleted"
        },
        {
          "type": "doubleSum",
          "name": "delta",
          "fieldName": "delta"
        }
      ],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "queryGranularity": "NONE",
        "intervals": ["2013-08-31/2013-09-01"]
      }
    },
    "ioConfig": {
      "type": "realtime",
      "firehose": {
        "type": "receiver",
        "serviceName": "wikiIngest"
      }
    },
    "tuningConfig": {
      "type": "realtime",
      "targetPartitionSize": 0,
      "rowFlushBoundary": 0
    }
  }
}

EventReceiver.json (1.63 KB)

I am also still trying to figure out the log configuration; here is what I currently have:

druid.storage.type=local
druid.storage.storage.storageDirectory=/tmp/log

Here is the log. I suspect that for this firehose I may need to provide different parser info, but I am not sure what is needed.

2015-04-15T23:59:10,570 ERROR [task-runner-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[RealtimeIndexTask{id=index_realtime_wikipedia_0_2015-04-15T23:59:02.349Z, type=index_realtime, dataSource=wikipedia}]
java.lang.ClassCastException: io.druid.data.input.impl.StringInputRowParser cannot be cast to io.druid.data.input.impl.MapInputRowParser
	at io.druid.segment.realtime.firehose.EventReceiverFirehoseFactory.connect(EventReceiverFirehoseFactory.java:53) ~[druid-server-0.7.0.jar:0.7.0]
	at io.druid.indexing.common.task.RealtimeIndexTask.run(RealtimeIndexTask.java:153) ~[druid-indexing-service-0.7.0.jar:0.7.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:235) [druid-indexing-service-0.7.0.jar:0.7.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:214) [druid-indexing-service-0.7.0.jar:0.7.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_75]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_75]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_75]
	at java.lang.Thread.run(Thread.java:745) [?:1.7.0_75]
2015-04-15T23:59:10,595 INFO [task-runner-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: { "id" : "index_realtime_wikipedia_0_2015-04-15T23:59:02.349Z", "status" : "FAILED", "duration" : 275 }

The event receiver firehose needs a "map" parser rather than a "string" parser.
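That matches the ClassCastException in the log (StringInputRowParser cannot be cast to MapInputRowParser). Concretely, a sketch of the corrected parser fragment in the dataSchema, keeping the rest of your parseSpec unchanged:

    "parser": {
      "type": "map",
      "parseSpec": {
        "format": "json",
        "timestampSpec": {
          "column": "timestamp",
          "format": "auto"
        },
        "dimensionsSpec": {
          "dimensions": ["page", "language", "user", "unpatrolled", "newPage",
                         "robot", "anonymous", "namespace", "continent",
                         "country", "region", "city"],
          "dimensionExclusions": [],
          "spatialDimensions": []
        }
      }
    }

If I remember right, once the task is running you can then POST JSON events to the peon's event receiver endpoint, which is keyed by the serviceName you configured (wikiIngest here).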

You might find tranquility’s code for setting up indexing tasks useful. It’s here: https://github.com/metamx/tranquility/blob/master/src/main/scala/com/metamx/tranquility/druid/DruidBeamMaker.scala#L79
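On the log question: druid.storage.* controls deep storage for segments, not task logs. If you want the logs of completed/failed tasks written to local disk, I believe the relevant properties (on the node running the tasks) are the druid.indexer.logs.* family, something like:

    # Assumed local task-log configuration; druid.indexer.logs.*
    # (not druid.storage.*) governs where task logs are stored.
    druid.indexer.logs.type=file
    druid.indexer.logs.directory=/tmp/log

Also note that your pasted config has "druid.storage.storage.storageDirectory" with "storage" doubled; the deep-storage property is druid.storage.storageDirectory.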