Trying to index the batch data using the event-receiver firehose

Hi,

With the “local” firehose everything works fine and the batch indexing takes place. However, I’m trying to use the “receiver” firehose, because in production the different nodes run on different machines.

The following is the task spec, which gets accepted without any error:

{
  "type" : "index",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "mytable",
      "parser" : {
        "type" : "map",
        "parseSpec" : {
          "format" : "json",
          "timestampSpec" : {
            "column" : "time",
            "format" : "auto"
          },
          "dimensionsSpec" : {
            "dimensions" : ["dim1"],
            "dimensionExclusions" : [],
            "spatialDimensions" : []
          }
        }
      },
      "metricsSpec" : [
        {
          "type" : "count",
          "name" : "count"
        },
        {
          "type" : "doubleSum",
          "name" : "m1",
          "fieldName" : "m1"
        },
        {
          "type" : "doubleSum",
          "name" : "m2",
          "fieldName" : "m2"
        }
      ],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "DAY",
        "queryGranularity" : "DAY",
        "intervals" : [ "2016-05-30/2016-06-03" ]
      }
    },
    "ioConfig" : {
      "type" : "index",
      "firehose" : {
        "type" : "receiver",
        "serviceName" : "myServiceName",
        "bufferSize" : 10000
      }
    },
    "tuningConfig" : {
      "type" : "index",
      "targetPartitionSize" : -1,
      "rowFlushBoundary" : 0,
      "numShards" : 1
    }
  }
}

The following is the command to submit this task (the spec above is saved as ‘my-task.json’ and 9087 is the Overlord port):

curl -X 'POST' -H 'Content-Type:application/json' -d @my-task.json localhost:9087/druid/indexer/v1/task

The Middle Manager is running on port 9089. I’m trying to submit the event data with:

curl -X 'POST' -H 'Content-Type:application/json' -d @myevent_data.json localhost:9089/druid/worker/v1/chat/myServiceName/push-events

I also tried the same thing with port 8100 (I’m not sure whether this is the peon port).

The data in myevent_data.json is as follows:

[{"time":"2016-06-02T00:00:00.000Z", "dim1": "dimval1", "m1":"20", "m2": "400"}]

Can anybody please help? What am I doing wrong? Any tips would be appreciated.

Thanks!!

Hi Salil, for batch indexing you should try reading http://druid.io/docs/0.9.0/tutorials/tutorial-batch.html

You can’t use the event receiver firehose to load historical data.
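
For reference, here is a rough sketch of how the ioConfig section might look with the “local” firehose in place of the “receiver” one. The baseDir value is only a placeholder path (an assumption, not something from your setup), and filter assumes the data file name mentioned in your post. Please check the tutorial above for the remaining fields and the expected input file layout rather than treating this as a complete spec.

```json
{
  "ioConfig" : {
    "type" : "index",
    "firehose" : {
      "type" : "local",
      "baseDir" : "/path/to/data",
      "filter" : "myevent_data.json"
    }
  }
}
```

With this kind of firehose the task reads the file itself from a directory visible to the node running it, so there is no separate push-events call.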