Null timestamp in input but the timestamp is there

Hi all!
I’m getting a ParseException while trying to ingest Kafka events that look like:
{"timestamp":"2018-03-29T23:31:26.077Z","idRequest":3753192267374775741,"codAggregator":"DM","result":"OK","stackTrace":"LVS-5345"}
The exception clearly shows that the timestamp field has not been handed to Tranquility Kafka, but why? The timestamp is well-formed JSON and has been successfully processed by Kafka.
I’m using Druid 0.12.0 and Tranquility 0.8.0, the exception is:
com.metamx.common.parsers.ParseException: Unparseable timestamp found!
at io.druid.data.input.impl.MapInputRowParser.parse(MapInputRowParser.java:72)
at io.druid.data.input.impl.StringInputRowParser.parseMap(StringInputRowParser.java:136)
at io.druid.data.input.impl.StringInputRowParser.parse(StringInputRowParser.java:74)
at io.druid.data.input.impl.StringInputRowParser.parse(StringInputRowParser.java:37)
at com.metamx.tranquility.druid.DruidBeams$$anonfun$1$$anonfun$7.apply(DruidBeams.scala:177)
at com.metamx.tranquility.druid.DruidBeams$$anonfun$1$$anonfun$7.apply(DruidBeams.scala:177)
at com.metamx.tranquility.druid.DruidBeams$$anonfun$1$$anonfun$apply$1.apply(DruidBeams.scala:195)
at com.metamx.tranquility.druid.DruidBeams$$anonfun$1$$anonfun$apply$1.apply(DruidBeams.scala:195)
at com.metamx.tranquility.beam.TransformingBeam$$anonfun$sendAll$2$$anonfun$2.apply(TransformingBeam.scala:36)
at com.twitter.util.Try$.apply(Try.scala:13)
at com.metamx.tranquility.beam.TransformingBeam$$anonfun$sendAll$2.apply(TransformingBeam.scala:36)
at com.metamx.tranquility.beam.TransformingBeam$$anonfun$sendAll$2.apply(TransformingBeam.scala:35)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:778)
at scala.collection.Iterator$class.foreach(Iterator.scala:742)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:777)
at com.metamx.tranquility.beam.TransformingBeam.sendAll(TransformingBeam.scala:35)
at com.metamx.tranquility.tranquilizer.Tranquilizer.com$metamx$tranquility$tranquilizer$Tranquilizer$$sendBuffer(Tranquilizer.scala:301)
at com.metamx.tranquility.tranquilizer.Tranquilizer$$anonfun$send$1.apply(Tranquilizer.scala:202)
at com.metamx.tranquility.tranquilizer.Tranquilizer$$anonfun$send$1.apply(Tranquilizer.scala:202)
at scala.Option.foreach(Option.scala:257)
at com.metamx.tranquility.tranquilizer.Tranquilizer.send(Tranquilizer.scala:202)
at com.metamx.tranquility.kafka.writer.TranquilityEventWriter.send(TranquilityEventWriter.java:76)
at com.metamx.tranquility.kafka.KafkaConsumer$2.run(KafkaConsumer.java:231)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException: Null timestamp in input: {idRequest=8643556674707087331, codAggregator=DM, result=KO, stackTrace=Dummy…
at io.druid.data.input.impl.MapInputRowParser.parse(MapInputRowParser.java:63)
… 30 more
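The NPE above means the flattened event map handed to MapInputRowParser simply had no `timestamp` key for that particular row. A quick way to reproduce the check on your side (a minimal Python sketch, not Tranquility code) is to re-parse a raw event the same way:

```python
import json
from datetime import datetime

def validate_timestamp(raw_event):
    """Return the parsed timestamp, or raise with a clear message
    when the field is missing or malformed (mirrors the Druid NPE)."""
    event = json.loads(raw_event)
    ts = event.get("timestamp")
    if ts is None:
        raise ValueError("Null timestamp in input: %r" % event)
    # Druid's "auto" format accepts ISO-8601; fromisoformat needs the
    # trailing 'Z' rewritten as an explicit offset on older Pythons.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

good = '{"timestamp":"2018-03-29T23:31:26.077Z","idRequest":1,"result":"OK"}'
print(validate_timestamp(good))  # 2018-03-29 23:31:26.077000+00:00
```

Running this over a sample of the topic would tell you immediately whether some rows really are missing the field.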

and the .json configuration file looks like:

{
  "dataSources" : {
    "aggregator-kafka" : {
      "spec" : {
        "dataSchema" : {
          "dataSource" : "aggregator-kafka",
          "parser" : {
            "type" : "string",
            "parseSpec" : {
              "timestampSpec" : {"format" : "auto", "column" : "timestamp"},
              "dimensionsSpec" : {
                "dimensions" : ["timestamp", "codAggregator", "result", "stackTrace"],
                "dimensionExclusions" : ["idRequest"]
              },
              "format" : "json"
            }
          },
          "granularitySpec" : {
            "type" : "uniform",
            "segmentGranularity" : "minute",
            "queryGranularity" : "none"
          },
          "metricsSpec" : [
            {
              "type" : "count",
              "name" : "count"
            }
          ]
        },
        "ioConfig" : {
          "type" : "realtime"
        },
        "tuningConfig" : {
          "type" : "realtime",
          "maxRowsInMemory" : "100000",
          "intermediatePersistPeriod" : "PT10M",
          "windowPeriod" : "PT10M"
        }
      },
      "properties" : {
        "task.partitions" : "1",
        "task.replicants" : "1",
        "topicPattern" : "aggregator-out"
      }
    }
  },
  "properties" : {
    "zookeeper.connect" : "localhost",
    "druid.discovery.curator.path" : "/druid/discovery",
    "druid.selectors.indexing.serviceName" : "druid/overlord",
    "commit.periodMillis" : "15000",
    "consumer.numThreads" : "2",
    "kafka.zookeeper.connect" : "localhost",
    "kafka.group.id" : "tranquility-kafka"
  }
}
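For reference, `"format" : "auto"` in the timestampSpec means Druid guesses per value between an ISO-8601 string and epoch milliseconds. A rough Python approximation of that behaviour (a sketch, not Druid's actual code) makes it easy to test what your events' timestamps would parse to:

```python
from datetime import datetime, timezone

def parse_auto(value):
    """Approximate Druid's 'auto' timestamp format: accept either
    an integer epoch-millis value or an ISO-8601 string."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value / 1000.0, tz=timezone.utc)
    return datetime.fromisoformat(str(value).replace("Z", "+00:00"))

# Both forms denote the same instant:
print(parse_auto("2018-03-29T23:31:26.077Z"))
print(parse_auto(1522366286077))
```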

Thanks in advance!

E

Can you make sure that the Druid JVM is running with the UTC timezone? (-Duser.timezone=UTC)

I’m using the default /conf-quickstart configuration shipped with the distribution; across the whole JVM command line I can see:

-Duser.timezone=UTC

I’m trying to narrow down the problem by first injecting events directly with json-data-generator. My configuration file looks like:

  "type": "tranquility",
  "zookeeper.host": "localhost",
  "zookeeper.port": 2181,
  "overlord.name":"overlord",
  "firehose.pattern":"druid:firehose:%s",
  "discovery.path":"/druid/discovery",
  "datasource.name":"aggregatest",
  "timestamp.name":"eventTimestamp",
  "sync": true       

while overlord starts with:

java `cat conf-quickstart/druid/overlord/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/overlord:lib/*" io.druid.cli.Main server overlord

But I’m not able to ingest anything, as I’m getting:

2018-04-05 22:49:36,312 ERROR n.a.d.j.g.l.TranquilityLogger [Thread-1] Error sending event to Druid
java.lang.IllegalStateException: Failed to save new beam for identifier[overlord/aggregatest] timestamp[2018-04-05T22:00:00.000+02:00]
at com.metamx.tranquility.beam.ClusteredBeam$$anonfun$2.applyOrElse(ClusteredBeam.scala:264) ~[tranquility_2.10-0.4.2.jar:0.4.2]
at com.metamx.tranquility.beam.ClusteredBeam$$anonfun$2.applyOrElse(ClusteredBeam.scala:261) ~[tranquility_2.10-0.4.2.jar:0.4.2]
at com.twitter.util.Future$$anonfun$rescue$1.apply(Future.scala:843) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.util.Future$$anonfun$rescue$1.apply(Future.scala:841) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.util.Promise$Transformer.liftedTree1$1(Promise.scala:100) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.util.Promise$Transformer.k(Promise.scala:100) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.util.Promise$Transformer.apply(Promise.scala:110) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.util.Promise$Transformer.apply(Promise.scala:91) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.util.Promise$$anon$2.run(Promise.scala:345) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.concurrent.LocalScheduler$Activation.run(Scheduler.scala:186) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.concurrent.LocalScheduler$Activation.submit(Scheduler.scala:157) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.concurrent.LocalScheduler.submit(Scheduler.scala:212) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.concurrent.Scheduler$.submit(Scheduler.scala:86) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.util.Promise.runq(Promise.scala:331) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.util.Promise.updateIfEmpty(Promise.scala:642) ~[util-core_2.10-6.23.0.jar:6.23.0]
at com.twitter.util.ExecutorServiceFuturePool$$anon$2.run(FuturePool.scala:112) ~[util-core_2.10-6.23.0.jar:6.23.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_20]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_20]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_20]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_20]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_20]
Caused by: com.twitter.finagle.NoBrokersAvailableException: No hosts are available for overlord
at com.twitter.finagle.NoStacktrace(Unknown Source) ~[?:?]

Thanks in advance…

You should probably check the logs during Druid startup; one of the services is failing to start.

I have found that all the jvm.config files under the conf-quickstart/druid/ folders for broker, coordinator, historical, middleManager, and overlord had a wrong -Duser.timezone. Now I have set all of them to the proper value. To make a simpler test, I avoided feeding Druid via Kafka and just used json-data-generator-1.3.0, whose config.json looks like:


"producers": [
  {
    "type": "tranquility",
    "zookeeper.host": "localhost",
    "zookeeper.port": 2181,
    "overlord.name": "overlord",
    "firehose.pattern": "druid:firehose:%s",
    "discovery.path": "/druid/discovery",
    "datasource.name": "aggregatest",
    "timestamp.name": "eventTimestamp",
    "sync": true
  },

I started the json generator to feed Druid, but I am still not able to see anything in the datasources of the Druid Console, nor any evident exceptions in the logs.

Where should I start to understand where the problem is and why the ingestion of events doesn’t work?

Thanks in advance…

I would recommend double checking your input data, it's possible that some
rows have missing or malformed timestamps.

e.g.,
`Caused by: java.lang.NullPointerException: Null timestamp in input:
{idRequest=8643556674707087331, codAggregator=DM, result=KO,
stackTrace=Dummy...`

I would look for the specific row in your input that has those dimension
values.
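To locate the offending rows, one option is to dump the topic to a file of newline-delimited JSON and scan it for events without the `timestamp` key. A sketch (the two sample events here stand in for your real topic contents):

```python
import json

# Simulated dump of the Kafka topic, one JSON event per line;
# in practice you would read the real topic contents.
events = [
    '{"timestamp":"2018-03-29T23:31:26.077Z","idRequest":1,"result":"OK"}',
    '{"idRequest":8643556674707087331,"codAggregator":"DM","result":"KO"}',
]

def find_suspect_rows(lines):
    """Yield (line_no, event) for rows missing the 'timestamp' key."""
    for line_no, line in enumerate(lines, start=1):
        event = json.loads(line)
        if "timestamp" not in event:
            yield line_no, event

suspects = list(find_suspect_rows(events))
for line_no, event in suspects:
    print(line_no, event["idRequest"], event["result"])  # 2 8643556674707087331 KO
```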

I started the json generator to feed Druid, but I am still not able to see anything in the datasources of the Druid Console, nor any evident exceptions in the logs. Where should I start to understand where the problem is and why the ingestion of events doesn’t work?

I would start by checking the Overlord console in your browser (port 8090 by default) to verify that a task was submitted, ran, and successfully completed.

Hi all!
I have checked the logs and I have not found any such exceptions:
e.g.,
Caused by: java.lang.NullPointerException: Null timestamp in input: {idRequest=8643556674707087331, codAggregator=DM, result=KO, stackTrace=Dummy...
Instead, I have found that in the Druid coordinator console (http://localhost:8090/console.html) there is an entry from even before I started sending events with the json-data-generator. There is a single row that looks like:
worker scheme => http
worker host => myHostName:8091
worker ip => myHostName
worker capacity => 3
worker version => 0
currCapacityUsed => 0
availabilityGroups =>
runningTasks =>
lastCompletedTaskTime => 2018-04-22T20:10:50.414Z
blacklistedUntil => null
and the task time, however, suggests that the coordinator has not read the jvm.config file under druid-0.12.0/conf-quickstart/druid/coordinator, where the timezone is set as:
-Duser.timezone=Europe/Rome
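As a side note, the earlier suggestion in this thread was -Duser.timezone=UTC, not Europe/Rome: with a Rome timezone the JVM computes time boundaries with a +02:00 offset, which is exactly the shape of the beam timestamp in the earlier "Failed to save new beam" error. A small Python illustration:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# The beam timestamp in the earlier error, 2018-04-05T22:00:00.000+02:00,
# is what an hour boundary looks like when computed in Europe/Rome
# instead of UTC:
t = datetime(2018, 4, 5, 22, 0, tzinfo=ZoneInfo("Europe/Rome"))
print(t.isoformat())                           # 2018-04-05T22:00:00+02:00
print(t.astimezone(timezone.utc).isoformat())  # 2018-04-05T20:00:00+00:00
```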

Any ideas?

TIA!

I am having the exact same issue. Did anyone fix it or resolve that error?

Check if your dimension name and timestamp column name match exactly.

I faced a similar issue while ingesting a CSV file; the timestamp issue could be because of a BOM (byte order mark).

(Screenshot of the BOM characters omitted.) In my case the timestamp is the first column, and the BOM gets prepended to the timestamp field.

Try removing the BOM from your file.

I opened the CSV in Notepad++ and changed the encoding from UTF-8-BOM to UTF-8, and the issue got resolved.
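In Python terms, the equivalent fix is decoding with the `utf-8-sig` codec, which strips a leading BOM if present (a small sketch of the failure mode):

```python
import codecs

# A CSV line as written by a tool that prepends a UTF-8 BOM.
raw = codecs.BOM_UTF8 + b"2018-03-29T23:31:26.077Z,DM,OK"

# Decoding as plain utf-8 leaves the BOM glued to the first column...
print(repr(raw.decode("utf-8").split(",")[0]))      # '\ufeff2018-03-29T23:31:26.077Z'

# ...while utf-8-sig strips it, so the timestamp parses cleanly.
print(repr(raw.decode("utf-8-sig").split(",")[0]))  # '2018-03-29T23:31:26.077Z'
```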

Hope this helps!