Ingestion missing data

We have an issue in production where data is not being ingested into the realtime node even though it is present in Kafka.

The segmentGranularity is 'hour' and the windowPeriod is the default, "PT10M".

There are no errors or exceptions in the indexing-service logs.

-->> The records whose CREATED_TIMESTAMP is marked in red are the missing data <<--

{"AvgDisk":"9.832028","ENTERPRISE_NAME":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","Memory":"28.9","CPU":"37.63","ELEMENT_NAME":"10.59.151.1","PARENT_SPECID":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","CREATED_TIMESTAMP":"2018-07-30T09:15:00.000Z","SUBELEMENT_NAME":"f471a2d4-7bcb-4970-bced-9ff7734872b7","DISPLAY_NAME":"es036-ac-r1","NAME":"ne-31798-s60007916-nsg-xxx-eu"}

{"AvgDisk":"9.832213","ENTERPRISE_NAME":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","Memory":"29.06","CPU":"3.52","ELEMENT_NAME":"10.59.151.1","PARENT_SPECID":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","CREATED_TIMESTAMP":"2018-07-30T09:30:00.000Z","SUBELEMENT_NAME":"f471a2d4-7bcb-4970-bced-9ff7734872b7","DISPLAY_NAME":"es036-ac-r1","NAME":"ne-31798-s60007916-nsg-xxx-eu"}

{"AvgDisk":"9.832574","ENTERPRISE_NAME":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","Memory":"29.0","CPU":"31.82","ELEMENT_NAME":"10.59.151.1","PARENT_SPECID":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","CREATED_TIMESTAMP":"2018-07-30T09:45:00.000Z","SUBELEMENT_NAME":"f471a2d4-7bcb-4970-bced-9ff7734872b7","DISPLAY_NAME":"es036-ac-r1","NAME":"ne-31798-s60007916-nsg-xxx-eu"}

{"AvgDisk":"9.832792","ENTERPRISE_NAME":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","Memory":"29.02","CPU":"3.59","ELEMENT_NAME":"10.59.151.1","PARENT_SPECID":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","CREATED_TIMESTAMP":"2018-07-30T10:00:00.000Z","SUBELEMENT_NAME":"f471a2d4-7bcb-4970-bced-9ff7734872b7","DISPLAY_NAME":"es036-ac-r1","NAME":"ne-31798-s60007916-nsg-xxx-eu"}

{"AvgDisk":"9.83336","ENTERPRISE_NAME":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","Memory":"28.94","CPU":"2.55","ELEMENT_NAME":"10.59.151.1","PARENT_SPECID":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","CREATED_TIMESTAMP":"2018-07-30T10:15:00.000Z","SUBELEMENT_NAME":"f471a2d4-7bcb-4970-bced-9ff7734872b7","DISPLAY_NAME":"es036-ac-r1","NAME":"ne-31798-s60007916-nsg-xxx-eu"}

{"AvgDisk":"9.83372","ENTERPRISE_NAME":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","Memory":"28.95","CPU":"30.57","ELEMENT_NAME":"10.59.151.1","PARENT_SPECID":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","CREATED_TIMESTAMP":"2018-07-30T10:30:00.000Z","SUBELEMENT_NAME":"f471a2d4-7bcb-4970-bced-9ff7734872b7","DISPLAY_NAME":"es036-ac-r1","NAME":"ne-31798-s60007916-nsg-xxx-eu"}

{"AvgDisk":"9.833905","ENTERPRISE_NAME":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","Memory":"29.09","CPU":"9.64","ELEMENT_NAME":"10.59.151.1","PARENT_SPECID":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","CREATED_TIMESTAMP":"2018-07-30T10:45:00.000Z","SUBELEMENT_NAME":"f471a2d4-7bcb-4970-bced-9ff7734872b7","DISPLAY_NAME":"es036-ac-r1","NAME":"ne-31798-s60007916-nsg-xxx-eu"}

{"AvgDisk":"9.834277","ENTERPRISE_NAME":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","Memory":"28.96","CPU":"37.7","ELEMENT_NAME":"10.59.151.1","PARENT_SPECID":"bdbd9597-77a6-403a-bcce-a7b6cb78c862","CREATED_TIMESTAMP":"2018-07-30T11:00:00.000Z","SUBELEMENT_NAME":"f471a2d4-7bcb-4970-bced-9ff7734872b7","DISPLAY_NAME":"es036-ac-r1","NAME":"ne-31798-s60007916-nsg-xxx-eu"}

Could you please help us find the cause?

Hi DS,

Do you know if those message timestamps were within 10 minutes of the time that you submitted them? One thing to watch out for is future timestamps: Tranquility will also reject things more than windowPeriod into the future.
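To make the windowPeriod check concrete, here is a rough sketch in Python. It is a simplification for illustration, not Tranquility's actual code: the function names and the submission time in the usage lines are made up, and the only values taken from this thread are the PT10M window and the timestamp format.

```python
from datetime import datetime, timedelta, timezone

# PT10M, the default windowPeriod mentioned above.
WINDOW_PERIOD = timedelta(minutes=10)


def parse_ts(value):
    """Parse a timestamp such as "2018-07-30T09:15:00.000Z" as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def within_window(event_ts, submitted_at, window=WINDOW_PERIOD):
    """True if the event is no more than `window` older or newer than
    the time it was submitted (assumed, simplified acceptance rule)."""
    return abs(submitted_at - event_ts) <= window


# Hypothetical submission time, chosen only for this example.
submitted_at = parse_ts("2018-07-30T09:20:00.000Z")
for ts in ("2018-07-30T09:00:00.000Z",
           "2018-07-30T09:15:00.000Z",
           "2018-07-30T09:45:00.000Z"):
    print(ts, within_window(parse_ts(ts), submitted_at))
```

With that submission time, only the 09:15 record falls inside the window: the 09:00 record is 20 minutes late and the 09:45 record is 25 minutes in the future.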

You could also look into migrating to the Kafka indexing service: http://druid.io/docs/latest/development/extensions-core/kafka-ingestion.html. It does not have a windowPeriod config, and can accept arbitrarily late or early data. It also connects directly from Druid to Kafka without needing an intermediary service, so it’s easy to set up.
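For reference, a supervisor spec for the Kafka indexing service could be built along these lines. Treat it as a sketch under assumptions: the data source name, topic, and broker address are placeholders, the dimension list is guessed from the sample records in this thread, and the layout follows the parser-based spec format, which differs in newer Druid versions. Check it against the documentation linked above for your version.

```python
import json


def build_supervisor_spec(data_source, topic, bootstrap_servers):
    """Return a Kafka supervisor spec as a plain dict (illustrative only)."""
    return {
        "type": "kafka",
        "dataSchema": {
            "dataSource": data_source,
            "parser": {
                "type": "string",
                "parseSpec": {
                    "format": "json",
                    # Timestamp column taken from the sample records.
                    "timestampSpec": {"column": "CREATED_TIMESTAMP",
                                      "format": "auto"},
                    # Assumed dimensions, based on the sample records.
                    "dimensionsSpec": {"dimensions": [
                        "ENTERPRISE_NAME", "ELEMENT_NAME", "PARENT_SPECID",
                        "SUBELEMENT_NAME", "DISPLAY_NAME", "NAME"]},
                },
            },
            "metricsSpec": [{"type": "count", "name": "count"}],
            "granularitySpec": {"type": "uniform",
                                "segmentGranularity": "HOUR",
                                "queryGranularity": "NONE"},
        },
        "tuningConfig": {"type": "kafka"},
        "ioConfig": {
            "topic": topic,
            "consumerProperties": {"bootstrap.servers": bootstrap_servers},
            "taskCount": 1,
            "replicas": 1,
            "taskDuration": "PT1H",
        },
    }


# Placeholder values; replace with your own.
spec = build_supervisor_spec("element_metrics", "element-metrics",
                             "localhost:9092")
print(json.dumps(spec, indent=2))
```

The resulting JSON is what you would POST to the Overlord at /druid/indexer/v1/supervisor. Note that there is no windowPeriod anywhere in the spec.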

Hi Gian,

Thanks for your input.

We are already using the Kafka indexing service to load the data into Druid.

I also don’t see any offsets being skipped in the ingestion service logs.

Could you please suggest where I should look to figure out why the data is not being processed?

Cheers!!