TimestampSpec for fractional (epoch + milliseconds) time

Hello,

I am experimenting with Druid, and I have files where the timestamp is expressed as seconds.fractional_seconds. Example: 1487249458.633

How do I specify my TimestampSpec in the ingestion spec to handle this? Using auto yields Caused by: com.metamx.common.parsers.ParseException: Unparseable timestamp found! errors in the logs.

Thanks.

-William

Hi, Druid supports only

iso, millis, posix, or any Joda time format.

auto means it will try each of the accepted formats listed above.

It seems your time format is not a standard that Druid can read, so you will need to convert it beforehand or plug in your own parser spec (ParseSpec.java).
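
If you go the convert-beforehand route, a minimal sketch of such a pre-processing step (a hypothetical helper, not part of Druid) that turns a seconds.fraction string into epoch milliseconds, which the millis format can then ingest:

```java
// Hypothetical pre-processing helper (not part of Druid): converts a
// "seconds.fraction" timestamp string into epoch milliseconds, which
// Druid's "millis" timestamp format can ingest directly.
public class TimestampConverter {
    public static long toMillis(String secondsWithFraction) {
        // Parse as a double and scale; Math.round avoids truncation
        // error from the double multiplication.
        return Math.round(Double.parseDouble(secondsWithFraction) * 1000);
    }

    public static void main(String[] args) {
        System.out.println(toMillis("1487249458.633")); // prints 1487249458633
    }
}
```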

I’m trying to understand what posix actually does. Looking at GitHub, it seems that this should be:
return new DateTime(input.longValue() * 1000);

which in my case would be DateTime(1487249458.633 * 1000). That seems like it should work, but I still get the following error:

Caused by: com.metamx.common.parsers.ParseException: Unparseable timestamp found!
	at io.druid.data.input.impl.MapInputRowParser.parse(MapInputRowParser.java:72) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.data.input.impl.StringInputRowParser.parseMap(StringInputRowParser.java:136) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.data.input.impl.StringInputRowParser.parse(StringInputRowParser.java:131) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopyStringInputRowParser.parse(HadoopyStringInputRowParser.java:48) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopDruidIndexerMapper.parseInputRow(HadoopDruidIndexerMapper.java:105) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopDruidIndexerMapper.map(HadoopDruidIndexerMapper.java:72) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.DetermineHashedPartitionsJob$DetermineCardinalityMapper.run(DetermineHashedPartitionsJob.java:285) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_73]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_73]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_73]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_73]
	at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_73]
Caused by: java.lang.NumberFormatException: For input string: "1487249458.633"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_73]
	at java.lang.Long.parseLong(Long.java:589) ~[?:1.8.0_73]
	at java.lang.Long.parseLong(Long.java:631) ~[?:1.8.0_73]
	at com.metamx.common.parsers.TimestampParser$3.apply(TimestampParser.java:73) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.parsers.TimestampParser$3.apply(TimestampParser.java:68) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.parsers.TimestampParser$9.apply(TimestampParser.java:159) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.parsers.TimestampParser$9.apply(TimestampParser.java:150) ~[java-util-0.27.10.jar:?]
	at io.druid.data.input.impl.TimestampSpec.extractTimestamp(TimestampSpec.java:97) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.data.input.impl.MapInputRowParser.parse(MapInputRowParser.java:60) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.data.input.impl.StringInputRowParser.parseMap(StringInputRowParser.java:136) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.data.input.impl.StringInputRowParser.parse(StringInputRowParser.java:131) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopyStringInputRowParser.parse(HadoopyStringInputRowParser.java:48) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopDruidIndexerMapper.parseInputRow(HadoopDruidIndexerMapper.java:105) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopDruidIndexerMapper.map(HadoopDruidIndexerMapper.java:72) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.DetermineHashedPartitionsJob$DetermineCardinalityMapper.run(DetermineHashedPartitionsJob.java:285) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_73]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_73]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_73]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_73]
	at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_73]

Why is posix the wrong choice in this case?

Thanks.

-William

As you can see, it is trying to parse a long, which fails due to the “.”:

Caused by: java.lang.NumberFormatException: For input string: "1487249458.633"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_73]
	at java.lang.Long.parseLong(Long.java:589) ~[?:1.8.0_73]
	at java.lang.Long.parseLong(Long.java:631) ~[?:1.8.0_73]

I thought Druid could read this as a number, but I’m not sure why it is being treated as a string.
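
The contrast can be shown directly in plain Java (an illustration of the stack trace above, not Druid code):

```java
// Illustration (not Druid code): the posix parser path calls
// Long.parseLong, which rejects the "." in a fractional timestamp,
// while Double.parseDouble accepts it.
public class ParseDemo {
    public static boolean longParseFails(String s) {
        try {
            Long.parseLong(s); // throws NumberFormatException on "."
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(longParseFails("1487249458.633")); // prints true
        System.out.println(Double.parseDouble("1487249458.633"));
    }
}
```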

Can you please file an issue? We will look at it ASAP.

Also, please upload the job spec and a sample of the data so that we can reproduce/unit test it.

Thanks

Thanks B-Slim. Issue submitted here: https://github.com/druid-io/druid/issues/3952

Let me know if I left out any important details.

-William

It’s been a long time since I’ve done any Java development.

Is there a tutorial for creating a custom parser spec?

Thanks.

-William

Maybe look at the Avro reader (http://druid.io/docs/latest/development/extensions-core/avro.html); it does more than you need, but it is a good example.

I think you should use ruby: https://github.com/metamx/java-util/blob/master/src/main/java/com/metamx/common/parsers/TimestampParser.java#L84
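
In rough terms, the ruby format appears to treat the value as fractional epoch seconds and scale it to milliseconds. A sketch of that behavior (using java.time here rather than the Joda-Time that Druid actually uses):

```java
import java.time.Instant;

// Sketch of what the "ruby" timestamp format appears to do, using
// java.time instead of Druid's Joda-Time: interpret the value as
// fractional epoch seconds and scale to milliseconds.
public class RubyStyleParse {
    public static Instant parse(String rubyTimestamp) {
        long millis = Math.round(Double.parseDouble(rubyTimestamp) * 1000);
        return Instant.ofEpochMilli(millis);
    }

    public static void main(String[] args) {
        System.out.println(parse("1487249458.633"));
    }
}
```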

On Saturday, February 18, 2017 at 1:23:53 AM UTC+8, William Cox wrote:

Hello kaijian,

It appears that ruby does work as a timestampSpec! That’s awesome and will save me a lot of time.
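
For anyone landing here later, the relevant part of the spec ended up looking roughly like this (the column name is a placeholder for your own timestamp field):

```json
{
  "timestampSpec": {
    "column": "timestamp",
    "format": "ruby"
  }
}
```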

Thanks.

-William