We are using Druid 0.10.1. Segments for two days are missing from our query results. The missing segments are listed in the coordinator console, and their ‘used’ flags are set to true in the metadata store. The segments were also pulled from deep storage (S3) into the local segmentCache on the historical nodes. However, queries return no data from these segments, and a ‘segmentMetadata’ query returns an empty result for them. We didn’t find any meaningful logs related to the issue. Does anyone know what’s going on here?
A little more background on this issue:
We originally used Kafka to ingest the data into Druid.
Then we reindexed those segments using a native ‘index’ task with an ‘ingestSegment’ firehose.
Then we found incorrect data in the segments. It was caused by a bug in our application logic, not by ingestion, so we dropped those segments.
After the segments were dropped, we re-ingested the data using Kafka.
Then we saw that the re-ingested segments were missing from query results.
Could it be caused by a segment version conflict?
Has anyone seen this before, or does anyone know what might be happening? Thanks.
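One way to look into the version question is to list the interval, version, and partition number of every segment marked as used for the affected days and see whether more than one version exists for the same interval. In Druid, a segment with a lower version is overshadowed by a higher version covering the same interval, and versions are compared as strings. The sketch below is only an illustration: the sample rows are hypothetical, `find_overshadowed` is a made-up helper name, and it only compares segments whose intervals are identical (Druid itself also handles partially overlapping intervals).

```python
from collections import defaultdict


def find_overshadowed(segments):
    """Return segments that have a lower version than another segment
    with the exact same interval.

    `segments` is a list of (interval, version, partition_num) tuples,
    for example copied from the metadata store or the coordinator console.
    Simplification: only identical intervals are compared.
    """
    by_interval = defaultdict(list)
    for interval, version, partition in segments:
        by_interval[interval].append((version, partition))

    overshadowed = []
    for interval, entries in by_interval.items():
        # Versions are compared as plain strings.
        newest = max(version for version, _ in entries)
        for version, partition in entries:
            if version < newest:
                overshadowed.append((interval, version, partition))
    return sorted(overshadowed)


# Hypothetical sample rows, not taken from the cluster in this thread.
sample = [
    ("2017-10-01/2017-10-02", "2017-10-05T00:00:00.000Z", 0),
    ("2017-10-01/2017-10-02", "2017-10-03T00:00:00.000Z", 0),
    ("2017-10-02/2017-10-03", "2017-10-05T00:00:00.000Z", 0),
]
print(find_overshadowed(sample))
```

If the re-ingested segments show up in the returned list, a version conflict is at least plausible. An empty list does not rule out other causes.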
Not sure how helpful this is, but if the coordinator sees these segments and the historicals have loaded them, maybe something odd is happening with the broker’s cached view of the cluster state. You could try modifying the query to specify the exact interval covered by the segment and sending it directly to a historical to see if this is the case.
If that query returns the data, restarting the brokers and monitoring their logs for issues would be a good way of figuring out what’s going on.
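The check described above could be sketched roughly as follows. This is a minimal example, not a definitive procedure: it posts a ‘segmentMetadata’ query to the native query endpoint (`/druid/v2/`) of whichever node you point it at. The host names, ports, data source name, and interval in the comments are placeholders, and the helper function names are made up for illustration.

```python
import json
import urllib.request


def build_segment_metadata_query(data_source, interval):
    """Build a segmentMetadata query restricted to one exact interval."""
    return {
        "queryType": "segmentMetadata",
        "dataSource": data_source,
        "intervals": [interval],
    }


def post_query(host, query, timeout=30):
    """POST a native Druid query to http://<host>/druid/v2/ and return
    the decoded JSON response. `host` is "hostname:port" of a broker
    or a historical node."""
    request = urllib.request.Request(
        "http://%s/druid/v2/" % host,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (placeholder values, adjust to your cluster):
#   query = build_segment_metadata_query("my_datasource",
#                                        "2017-10-01/2017-10-02")
#   from_historical = post_query("historical-host:8083", query)
#   from_broker = post_query("broker-host:8082", query)
```

If the historical returns segment metadata but the broker does not, the broker’s view of the cluster is the likely suspect. If both return nothing, the problem is more likely on the historical or in the segments themselves.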
Yeah, the weird thing is that the historical node didn’t return the segments either. The query was sent to the historical node directly:
and it returned empty results.
So the coordinator lists the segments, but the historical node doesn’t return any data for them.