[druid-user] Re: Druid ingest batching data from Kafka indexing service

Do you have any rollups defined in your supervisor spec?

Rommel Garcia
Director, Field Engineering
rommel.garcia@imply.io
404.502.9672

I have defined some dimensions and a count-type metric, and rollup is set to true.
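For reference, a supervisor spec with rollup enabled and a count metric typically contains a fragment like this minimal sketch (the datasource name and granularities here are placeholders, not taken from the attached file):

```json
"dataSchema": {
  "dataSource": "my_datasource",
  "granularitySpec": {
    "type": "uniform",
    "segmentGranularity": "HOUR",
    "queryGranularity": "MINUTE",
    "rollup": true
  },
  "metricsSpec": [
    { "type": "count", "name": "count" }
  ]
}
```

With rollup enabled, rows that share the same truncated timestamp and dimension values are combined at ingestion time, and the count metric records how many input rows were merged into each stored row.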

On Sunday, January 13, 2019 at 9:42:10 AM UTC+8, Rommel Garcia wrote:

I've attached my Druid ingestion JSON file. Can you help me analyze what's wrong?

On Sunday, January 13, 2019 at 4:35:01 PM UTC+8, 赵升阳 wrote:

druid-Kafka_Unknown.json (1.28 KB)

Hi:

According to the docs, indexing tasks read events using Kafka's own partition and offset mechanism, so multiple rows packed into a single Kafka record are still consumed as one offset. When Druid sees more fields in a Kafka record than the columns defined in your ingestion spec, it drops the trailing data. You can confirm this if you only see xxxx in the query output after ingestion.
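To illustrate the column-mismatch behavior described above, here is a hedged sketch of a delimited-data parser block (the column names are hypothetical, not from the attached spec). If a record carries more delimited fields than are listed in "columns", the extra trailing fields have no column to map to and are dropped:

```json
"parser": {
  "type": "string",
  "parseSpec": {
    "format": "tsv",
    "timestampSpec": { "column": "ts", "format": "auto" },
    "columns": ["ts", "user", "value"],
    "dimensionsSpec": { "dimensions": ["user"] }
  }
}
```

So a record like `2019-01-13T00:00:00Z<TAB>alice<TAB>7<TAB>extra1<TAB>extra2` would ingest only the first three fields under this sketch; comparing the field count in your raw records against the "columns" list in your spec is a quick way to check for this.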

BTW, in Druid, ingesting from Kafka is called stream ingestion. Batch ingestion refers to ingesting non-real-time data, which is not the case here. http://druid.io/docs/latest/ingestion/stream-ingestion.html

Ming