Indexing task failing

Hi all,

I have been trying to ingest data into Druid, and the task failed with the stack trace below:

2018-08-10T14:11:03,600 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[IndexTask{id=index_tpch_lineitem_small_2018-08-10T14:10:56.913Z, type=index, dataSource=tpch_lineitem_small}]
java.lang.NullPointerException
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:176) ~[druid-indexing-service-0.10.1.jar:0.10.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.10.1.jar:0.10.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.10.1.jar:0.10.1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_144]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_144]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_144]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
2018-08-10T14:11:03,606 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_tpch_lineitem_small_2018-08-10T14:10:56.913Z] status changed to [FAILED].
2018-08-10T14:11:03,611 INFO [main] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_tpch_lineitem_small_2018-08-10T14:10:56.913Z",
  "status" : "FAILED",
  "duration" : 8
}
2018-08-10T14:11:03,616 INFO [main] org.eclipse.jetty.server.Server - jetty-9.3.19.v20170502


**Indexing Spec:**


{
  "type": "index",
  "spec": {
    "dataSchema": {
      "dataSource": "tpch_lineitem_small",
      "parser": {
        "parseSpec": {
          "timestampSpec": {
            "column": "l_shipdate",
            "format": "yyyy-MM-dd"
          },
          "dataSpec": {
            "format": "tsv",
            "delimiter": "|",
            "columns": [
              "l_orderkey",
              "l_partkey",
              "l_suppkey",
              "l_linenumber",
              "l_quantity",
              "l_extendedprice",
              "l_discount",
              "l_tax",
              "l_returnflag",
              "l_linestatus",
              "l_shipdate",
              "l_commitdate",
              "l_receiptdate",
              "l_shipinstruct",
              "l_shipmode",
              "l_comment"
            ],
            "dimensions": [
              "l_orderkey",
              "l_partkey",
              "l_suppkey",
              "l_linenumber",
              "l_returnflag",
              "l_linestatus",
              "l_shipdate",
              "l_commitdate",
              "l_receiptdate",
              "l_shipinstruct",
              "l_shipmode",
              "l_comment"
            ]
          },
          "granularitySpec": {
            "type": "arbitrary",
            "intervals": [
              "1980/2020"
            ]
          },
          "ioConfig": {
            "type": "index",
            "firehose": {
              "type": "static",
              "paths": "/indexfiles/lineitem.tbl.gz"
            }
          },
          "rollupSpec": {
            "aggs": [
              {
                "type": "count",
                "name": "count"
              },
              {
                "type": "longSum",
                "fieldName": "L_QUANTITY",
                "name": "L_QUANTITY"
              },
              {
                "type": "doubleSum",
                "fieldName": "L_EXTENDEDPRICE",
                "name": "L_EXTENDEDPRICE"
              },
              {
                "type": "doubleSum",
                "fieldName": "L_DISCOUNT",
                "name": "L_DISCOUNT"
              },
              {
                "type": "doubleSum",
                "fieldName": "L_TAX",
                "name": "L_TAX"
              }
            ],
            "rollupGranularity": "day"
          }
        }
      }
    }
  }
}

Please let me know if there are any changes to be made to get it working.

Thanks


Hello,

The parseSpec properties (format, delimiter, columns, etc.) should not be nested inside dataSpec; define them directly under parseSpec. Please refer to http://druid.io/docs/0.10.1/ingestion/index.html#tsv-delimited-parsespec for guidance on how to define these properties. Also, define the aggregators under the metricsSpec property and the rollup granularity under the queryGranularity property of granularitySpec. You can find an example ingestion spec here: http://druid.io/docs/0.10.1/ingestion/index.html#dataschema. Please use it as a reference and modify your ingestion spec accordingly.
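The layout points above can be turned into a quick pre-submission check. This is only a minimal sketch: `find_layout_problems` is a hypothetical helper, not part of Druid, and it looks only for the misplaced keys discussed in this thread rather than validating a spec in general.

```python
def find_layout_problems(task):
    """Return a list of layout problems found in an index task spec (a dict)."""
    problems = []
    spec = task.get("spec", {})
    data_schema = spec.get("dataSchema", {})
    parse_spec = data_schema.get("parser", {}).get("parseSpec", {})

    # format/delimiter/columns belong directly in parseSpec, not in "dataSpec".
    if "dataSpec" in parse_spec:
        problems.append("parseSpec.dataSpec: move its properties up into parseSpec")
    # Aggregators go in metricsSpec; rollup granularity goes in queryGranularity.
    if "rollupSpec" in parse_spec:
        problems.append(
            "parseSpec.rollupSpec: move aggs to dataSchema.metricsSpec and the "
            "granularity to granularitySpec.queryGranularity"
        )
    # These two sections sit outside parseSpec in the documented layout.
    if "granularitySpec" in parse_spec:
        problems.append("parseSpec.granularitySpec: move to dataSchema.granularitySpec")
    if "ioConfig" in parse_spec:
        problems.append("parseSpec.ioConfig: move to spec.ioConfig")
    return problems


# Trimmed-down version of the failing layout, for illustration only.
example = {
    "type": "index",
    "spec": {"dataSchema": {"parser": {"parseSpec": {
        "dataSpec": {"format": "tsv"},
        "rollupSpec": {"aggs": []},
    }}}},
}
for problem in find_layout_problems(example):
    print(problem)
```

To check a spec stored in a file, load it with `json.load` and pass the resulting dict to the function; an empty list means none of these particular misplacements were found.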

Thanks,

Atul

I had to tinker with the schema a little more, but I finally got it working. Thank you!

Below is the spec that worked for me, in case someone needs it:

{
  "type": "index",
  "spec": {
    "dataSchema": {
      "dataSource": "tpch_lineitem_small",
      "parser": {
        "parseSpec": {
          "format": "tsv",
          "delimiter": "|",
          "columns": [
            "l_orderkey",
            "l_partkey",
            "l_suppkey",
            "l_linenumber",
            "l_quantity",
            "l_extendedprice",
            "l_discount",
            "l_tax",
            "l_returnflag",
            "l_linestatus",
            "l_shipdate",
            "l_commitdate",
            "l_receiptdate",
            "l_shipinstruct",
            "l_shipmode",
            "l_comment"
          ],
          "timestampSpec": {
            "column": "l_shipdate",
            "format": "yyyy-MM-dd"
          },
          "dimensionsSpec": {
            "dimensions": [
              "l_orderkey",
              "l_partkey",
              "l_suppkey",
              "l_linenumber",
              "l_returnflag",
              "l_linestatus",
              "l_shipdate",
              "l_commitdate",
              "l_receiptdate",
              "l_shipinstruct",
              "l_shipmode",
              "l_comment"
            ]
          }
        }
      },
      "granularitySpec": {
        "type": "arbitrary",
        "queryGranularity": "DAY",
        "intervals": [
          "1980/2020"
        ]
      },
      "metricsSpec": [
        {
          "type": "count",
          "name": "count"
        },
        {
          "type": "longSum",
          "fieldName": "L_QUANTITY",
          "name": "L_QUANTITY"
        },
        {
          "type": "doubleSum",
          "fieldName": "L_EXTENDEDPRICE",
          "name": "L_EXTENDEDPRICE"
        },
        {
          "type": "doubleSum",
          "fieldName": "L_DISCOUNT",
          "name": "L_DISCOUNT"
        },
        {
          "type": "doubleSum",
          "fieldName": "L_TAX",
          "name": "L_TAX"
        },
        {
          "type": "hyperUnique",
          "fieldName": "L_SHIPMODE",
          "name": "L_SHIPMODE"
        }
      ]
    },
    "ioConfig": {
      "type": "index",
      "firehose": {
        "type": "local",
        "filter": "lineitem.tbl.gz",
        "baseDir": "/druid/current/indexfiles"
      }
    }
  }
}
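For anyone adapting this spec, one thing that may be worth double-checking is that every name referenced by timestampSpec, dimensionsSpec, and metricsSpec also appears in "columns"; in the spec above the metric fieldName values are upper-case while the columns are lower-case. The sketch below assumes names are compared case-sensitively, which is an assumption and not something confirmed in this thread. `check_column_references` is a hypothetical helper, not a Druid API.

```python
def check_column_references(task):
    """Return (section, name) pairs for names that are not listed in "columns"."""
    data_schema = task["spec"]["dataSchema"]
    parse_spec = data_schema["parser"]["parseSpec"]
    columns = set(parse_spec["columns"])

    missing = []
    ts_column = parse_spec["timestampSpec"]["column"]
    if ts_column not in columns:
        missing.append(("timestampSpec", ts_column))
    for dim in parse_spec["dimensionsSpec"]["dimensions"]:
        # Dimensions are plain strings in this thread; tolerate {"name": ...} objects too.
        name = dim if isinstance(dim, str) else dim.get("name")
        if name not in columns:
            missing.append(("dimensionsSpec", name))
    for agg in data_schema.get("metricsSpec", []):
        field = agg.get("fieldName")  # "count" aggregators carry no fieldName
        if field is not None and field not in columns:
            missing.append(("metricsSpec", field))
    return missing


# Trimmed-down spec in the working layout, for illustration only.
sample = {
    "spec": {"dataSchema": {
        "parser": {"parseSpec": {
            "columns": ["l_quantity", "l_shipdate", "l_shipmode"],
            "timestampSpec": {"column": "l_shipdate", "format": "yyyy-MM-dd"},
            "dimensionsSpec": {"dimensions": ["l_shipdate", "l_shipmode"]},
        }},
        "metricsSpec": [
            {"type": "count", "name": "count"},
            {"type": "longSum", "fieldName": "L_QUANTITY", "name": "L_QUANTITY"},
        ],
    }},
}
print(check_column_references(sample))
```

A non-empty result does not necessarily mean the task will fail; it is only a prompt to confirm that the metric columns contain the values you expect after ingestion.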