Kafka Ingestion Timestamp

Hi everyone.

I am new to Druid and have set up a basic ingestion from my Kafka topics. The payloads of my Kafka messages do not contain a timestamp.
Is it currently possible to use the current time of ingestion as a timestamp?

Thanks in advance!

Hi abaschkim,

Welcome to the Druid forum! You can set a default with something like this:

"missingValue": "2022-01-01T00:00:00Z"

in the timestampSpec as mentioned here: Ingestion · Apache Druid

but I suspect you want to add the ingestion time per record, correct?
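For context, here is a minimal sketch of where that setting sits inside a timestampSpec. The column name "timestamp" is only a placeholder (the messages in question have no such field), and "auto" is just one possible format value. With this, every record that lacks the column gets the same fixed fallback time, not the time it was ingested.

```json
{
  "timestampSpec": {
    "column": "timestamp",
    "format": "auto",
    "missingValue": "2022-01-01T00:00:00Z"
  }
}
```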

Yes, that is what I am currently doing, and I was hoping there was something like ‘current_time’.

We deliberately avoid this functionality because it reduces rollup and has other potential issues.

Please read through this discussion for more information if needed:

Credit to @Ben_Krug who pointed out the following:

As of recently, I believe you can get event header info such as the timestamp; see Apache Kafka ingestion · 2022.02. I'm not sure whether it can be used directly as the timestamp, or whether you'll need a transform for __time.

You can use the Kafka message timestamp, which is not the same as the ingestion timestamp. Also, this functionality is currently only available in Imply's version of Druid; it has not been included in the open source release. (You may be able to use this if you build your own Apache Druid from the latest snapshot on GitHub, though.)
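A rough sketch of how this might look in a supervisor spec, assuming the "kafka" input format described in the linked documentation. The field names below (timestampColumnName, valueFormat) and the "kafka.timestamp" column are taken from that format as I understand it and should be checked against the docs for your version. This is only a partial fragment, not a complete spec.

```json
{
  "dataSchema": {
    "timestampSpec": {
      "column": "kafka.timestamp",
      "format": "millis"
    }
  },
  "ioConfig": {
    "inputFormat": {
      "type": "kafka",
      "timestampColumnName": "kafka.timestamp",
      "valueFormat": {
        "type": "json"
      }
    }
  }
}
```

The idea is that the input format exposes the Kafka record timestamp as an ordinary column, which the timestampSpec then reads as epoch milliseconds. This is the time attached to the message in Kafka, not the moment Druid ingested it.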

Thanks for pointing that out, Hellmar. I seem to have missed that detail.