I am experiencing a weird issue with one of my real-time pipelines involving Druid.
I am using Kafka + Storm + Druid as a real-time pipeline. The process is simple: I parse logs coming from Kafka in a Storm topology and send a set of dimensions and metrics directly to Druid using Tranquility (no aggregation is performed in Storm).
For some reason, we ended up with a crazy number for our revenue metric for hour 20 today ($1,435,636,203,520.00, which is obviously wrong). We have one replica for this real-time task, and our index and segment granularities are both set to one hour.
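One quick check on the number itself, as a minimal sketch in Python: the value has 13 digits, which is the usual length of an epoch timestamp in milliseconds, so it may be worth decoding it as one. This is only an assumption about what the value could be, not something the logs confirm, and the helper name `as_epoch_millis` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Revenue value reported for the bad hour, copied from the paragraph above.
suspicious_revenue = 1_435_636_203_520

def as_epoch_millis(value):
    """Interpret an integer as milliseconds since the Unix epoch (UTC).

    Hypothetical helper: it only tests the assumption that the value
    might be a timestamp rather than a revenue amount.
    """
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(milliseconds=value)

decoded = as_epoch_millis(suspicious_revenue)
print(decoded.isoformat())  # 2015-06-30T03:50:03.520000+00:00
```

If the decoded instant falls inside the affected hour in our local time zone, that would hint that a timestamp field ended up in the revenue position for at least one event, which would point at the log parsing rather than at the aggregation.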
I’m guessing the problem came from one of the middle managers aggregating events coming from Storm. However, I don’t see anything useful in my middle manager logs that could help me troubleshoot this issue.
Do you have any recommendations or ideas about what could have happened? Have you ever heard of such a problem before? Is it possible to pinpoint an issue with middle manager aggregation from the Druid logs?
I have attached my real-time task specification. I am using Druid 0.7.0. I hope we can find out what happened, as Druid is becoming very critical for our real-time data.
real_time_task.json (6.57 KB)