After going through the quickstart, I am now trying to load my own data.
When I submit the task with
curl -X 'POST' -H 'Content-Type:application/json' -d @/home/dhopkins/druid-messages-index.json http://localhost:8090/druid/indexer/v1/task
I see success in the console, and the task log also shows success.
The "Nothing to publish, skipping publish step." line in the log below looks suspect. Does it mean Druid is not finding any records in the data file?
Could my ISO timestamp format be the issue? It does not include milliseconds.
I also noticed that the quickstart example includes JSON fields explicitly even when their values are null. Is that needed?
That is, if a column listed in the spec is not present in the data, what happens?
And if a column is present in the data but not defined in the spec, what happens?
2019-02-04T19:11:20,010 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Pushing segments in background:
2019-02-04T19:11:20,010 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Submitting persist runnable for dataSource[messages_index]
2019-02-04T19:11:20,018 INFO [publish-0] io.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Dropping segments[]
2019-02-04T19:11:20,024 INFO [task-runner-0-priority-0] io.druid.indexing.common.task.IndexTask - Pushed segments[]
2019-02-04T19:11:20,026 INFO [publish-0] io.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Nothing to publish, skipping publish step.
2019-02-04T19:11:20,027 INFO [task-runner-0-priority-0] io.druid.indexing.common.task.IndexTask - Published segments
2019-02-04T19:11:20,027 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting down…
2019-02-04T19:11:20,029 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_messages_index_2019-02-04T19:10:15.763Z] status changed to [SUCCESS].
2019-02-04T19:11:20,032 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_messages_index_2019-02-04T19:10:15.763Z",
  "status" : "SUCCESS",
  "duration" : 308
}
The "druid cluster" console
http://wdc-tst-bdrd-001.openmarket.com:8081/#/
does not show my datasource (it does show the one from the quickstart).
The overlord console (which shows the tasks, etc.)
http://wdc-tst-bdrd-001.openmarket.com:8090/console.html
does show my datasource, and shows that the task succeeded.
Here is my spec:
{
  "type":"index",
  "spec" : {
    "dataSchema":{
      "dataSource":"messages_index",
      "parser":{
        "type":"string",
        "parseSpec":{
          "format":"json",
          "dimensionsSpec":{
            "dimensions":[
              "acceptedDate",
              "acceptedDateBucket",
              "deliveredDate",
              "updatedDate",
              "accountName",
              "accountId",
              "companyId",
              "countryName",
              "countryCode",
              "carrierName",
              "caId",
              "messageType",
              "messageOriginator",
              "messageOriginatorTon",
              "phoneNumber",
              "sourceAddress",
              "destinationAddress",
              "productName",
              "productId",
              "productIdDescription",
              "subaccount",
              "userDefined1",
              "userDefined2",
              "messageStatus",
              "responseCode",
              "responseCodeDescription",
              "messageId",
              "parentMessageId",
              "livup",
              "apiVersion",
              "contentEncoding",
              "userDataHeader",
              "remoteIpAddress",
              "remoteResponseCode",
              "userAgent",
              "productSubType",
              "pId",
              "internalMessageId",
              "Score"
            ]
          },
          "timestampSpec":{
            "column":"acceptedDateBucket",
            "format":"iso"
          }
        }
      },
      "metricsSpec":[],
      "granularitySpec":{
        "rollup":false,
        "segmentGranularity":"MINUTE",
        "queryGranularity":"MINUTE"
      }
    },
    "ioConfig" : {
      "type" : "index",
      "firehose" : {
        "type" : "local",
        "baseDir" : "/home/dhopkins/",
        "filter" : "kafka-file-dump.json"
      },
      "appendToExisting" : false
    },
    "tuningConfig":{
      "type" : "index",
      "targetPartitionSize" : 5000000,
      "maxRowsInMemory" : 25000,
      "forceExtendableShardSpecs" : true
    }
  }
}
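One check that can be done outside Druid is whether every line of the dump carries a parseable value in the column named by the timestampSpec above. The following is only a rough sketch in Python, assuming the dump is newline-delimited JSON (one record per line). The helper names `parse_iso` and `check_timestamps` are made up for illustration, and Python's date parsing is not the code Druid uses for its "iso" format, so this can only flag missing or obviously malformed values.

```python
import json
from datetime import datetime

# Formats tried for the timestamp column: with and without milliseconds.
# This only approximates ISO 8601; it is NOT Druid's own "iso" parser.
ISO_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def parse_iso(value):
    """Return a datetime if value matches one of ISO_FORMATS, else None."""
    if not isinstance(value, str):
        return None
    for fmt in ISO_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    return None


def check_timestamps(lines, column):
    """Classify each non-empty line by what is wrong with it, if anything."""
    counts = {"ok": 0, "bad_json": 0, "missing_column": 0, "bad_timestamp": 0}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            counts["bad_json"] += 1
            continue
        if not isinstance(record, dict) or column not in record:
            counts["missing_column"] += 1
        elif parse_iso(record[column]) is None:
            counts["bad_timestamp"] += 1
        else:
            counts["ok"] += 1
    return counts
```

To use it on the file from the spec, open `/home/dhopkins/kafka-file-dump.json` and pass the file object together with `"acceptedDateBucket"` to `check_timestamps`. A large `bad_json` count would suggest the records are not one per line; a large `missing_column` count would suggest the timestamp column is absent from the data.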
Here is a sample data record:
{"destinationAddress":"111111111",
"remoteResponseCode":"",
"accountName":"11111112F27-11111-A447-7D5433A69CF5",
"updatedDate":"2019-02-03T18:43:33Z",
"messageOriginatorTon":"1",
"responseCodeDescription":"Message delivered",
"acceptedDate":"2019-02-03T18:43:21Z",
"productSubType":"TRANSACTIONAL",
"productName":"111 Way",
"responseCode":"4",
"deliveredDate":"2019-02-03T18:43:33Z",
"Score":"[score20m=0, score10m=0, score24h=0]",
"apiVersion":"VERSION_4",
"carrierName":"V11o",
"messageType":"MT",
"countryCode":"BR",
"messageOriginator":"11111111",
"parentMessageId":"",
"contentEncoding":"UTF-8",
"productIdDescription":"CXsfd)",
"remoteIpAddress":"11.11.11.444",
"sourceAddress":"1111111",
"productId":"133",
"subaccount":"SSDFDFICE",
"messageId":"1119Z-0203T-1843Q-2137S",
"userAgent":"V4HTTP",
"messageStatus":"Delivered",
"accountId":"112-11111",
"internalMessageId":"11111-11111-11111-2137S",
"companyId":"000-000-00000-00000",
"phoneNumber":"1111111111111",
"userDataHeader":"",
"userDefined2":"fcbdf3de-96c1-42d9-96b5-8c92c3c8b1a7",
"countryName":"Brazil",
"caId":"111",
"livup":"false",
"userDefined1":"InvokeRId=111-11111-11111-IF20L-4R824-PSI",
"pId":""}
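The two column questions (columns in the spec but not in the data, and columns in the data but not in the spec) can at least be made concrete by comparing the spec with one parsed record. This is a minimal sketch in Python, not anything Druid provides: `compare_columns` is a hypothetical helper, and the spec and record below are trimmed-down stand-ins for the ones posted above. It says nothing about how Druid treats such columns, it only lists them.

```python
def compare_columns(spec, record):
    """Compare the columns an ingestion spec names with one parsed record."""
    parse_spec = spec["spec"]["dataSchema"]["parser"]["parseSpec"]
    timestamp_column = parse_spec["timestampSpec"]["column"]
    dimensions = parse_spec["dimensionsSpec"]["dimensions"]
    # Dimensions may be plain strings or objects with a "name" key.
    names = [d["name"] if isinstance(d, dict) else d for d in dimensions]
    keys = set(record)
    return {
        "timestamp_column_present": timestamp_column in keys,
        "dimensions_not_in_record": sorted(n for n in names if n not in keys),
        "record_keys_not_in_spec": sorted(keys - set(names) - {timestamp_column}),
    }


# Trimmed-down illustration; the real spec lists many more dimensions.
spec = {"spec": {"dataSchema": {"parser": {"parseSpec": {
    "dimensionsSpec": {"dimensions": ["acceptedDate", "acceptedDateBucket", "messageId"]},
    "timestampSpec": {"column": "acceptedDateBucket", "format": "iso"},
}}}}}
record = {"acceptedDate": "2019-02-03T18:43:21Z", "messageId": "1119Z-0203T-1843Q-2137S"}

result = compare_columns(spec, record)
print(result)
# {'timestamp_column_present': False, 'dimensions_not_in_record': ['acceptedDateBucket'], 'record_keys_not_in_spec': []}
```

Applied to the full spec and the sample record as pasted above, a comparison like this would flag `acceptedDateBucket`, the column named in the timestampSpec, as the only spec column absent from the record. Whether the real file contains that column on other records is not something the pasted sample can show.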