We are sending two different JSON-formatted message types to the same Kafka topic and have two ingestion tasks running on the Druid side to create two separate data sources. We noticed that some of the key fields in the data sources were missing, so we started to investigate.
Basically, we have type A and type B messages sent to the same Kafka topic. To narrow things down, we kept A enabled and stopped producing B to the topic. The following is a sample A message:
{
  "fields": {
    "cpu_usage_process/avg_run_time": 0,
    "cpu_usage_process/five_minutes": 0,
    "cpu_usage_process/five_seconds": 0,
    "cpu_usage_process/invocation_count": 1,
    "cpu_usage_process/name": "Policy bind Process",
    "cpu_usage_process/one_minute": 0,
    "cpu_usage_process/pid": 597,
    "cpu_usage_process/total_run_time": 0,
    "cpu_usage_process/tty": 0,
    "five_minutes": 3,
    "five_seconds": 3,
    "five_seconds_intr": 0,
    "one_minute": 3
  },
  "name": "Cisco-IOS-XE-process-cpu-oper:cpu-usage/cpu-utilization",
  "tags": {
    "host": "mtllab1",
    "path": "Cisco-IOS-XE-process-cpu-oper:cpu-usage/cpu-utilization",
    "source": "10.10.10.10",
    "subscription": "12"
  },
  "timestamp": 1581780294
}
We then terminated the Druid task that ingests A messages and left the B ingestion task running. Even though only A messages were being produced, we noticed that messages were still being partially ingested into the B data source. Somehow the spec file for B matches messages formatted for A. Please find attached a copy of the spec file. Is there something we can do to make ingestion stricter?
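To illustrate the kind of strictness we have in mind, here is a minimal sketch of a row filter for the B spec. It assumes that a "transformSpec" filter placed inside the "dataSchema" can see the top-level "name" field of each message; we have not verified this against the attached spec, so please treat it as an idea rather than a working configuration. It would drop any row whose "name" matches the A sample above:

```json
{
  "transformSpec": {
    "filter": {
      "type": "not",
      "field": {
        "type": "selector",
        "dimension": "name",
        "value": "Cisco-IOS-XE-process-cpu-oper:cpu-usage/cpu-utilization"
      }
    }
  }
}
```

If B messages carry their own distinct "name" value, a plain "selector" filter on that value might be the simpler choice, since it would also reject any other message type that later ends up on the topic.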
Can someone please take a look and let us know what we are missing here?
Thanks,
telemetry.json (3.48 KB)