Relates to Apache Druid 0.20.1
Segments covering three days are missing from our query results. The segments appear in the web console and in the sys.segments table, and their ‘used’ flags are set to true in the metadata store. They were also pulled from deep storage (S3) into the local segmentCache on the historical nodes. However, Druid queries return no data from them, and a ‘segmentMetadata’ query returns an empty result for those intervals. We didn’t find any meaningful logs related to the issue. Does anyone know what’s going on here?
More background on this issue:
- We were using Kafka to ingest the data into Druid.
- We then found that incorrect data had been ingested for 3 days. To reindex it, we marked all segments for those 3 days as unused and then ran a kill task to permanently remove them.
- We then re-ingested the correct data through the Kafka topic.
- After the re-ingestion, the segments for those 3 days were missing from query results.
Things I’ve tried
- Querying for rows in a given date time range
SELECT * FROM gps WHERE __time BETWEEN '2021-09-06' AND '2021-09-07'
Response: Query returned no data
- Sending the following segmentMetadata query directly to a historical node:
{
"queryType":"segmentMetadata",
"dataSource":"gps",
"intervals":["2021-09-06/2021-09-08"]
}
and it returned empty results.
- Applied a drop retention rule for those 3 days, then removed the drop rule and reloaded the data. Querying still returns no results even though the segments are loaded.
- Manually compacted the segments for those 3 days. Querying still returns no results.
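For anyone who wants to repeat the direct historical check from the list above, here is a minimal Python sketch of how the segmentMetadata query can be posted to a single node. It assumes the native query endpoint /druid/v2 and the default historical port 8083; the host name in the comment is a placeholder, and the helper names are only for illustration.

```python
import json
import urllib.request


def build_segment_metadata_query(datasource, intervals):
    """Build the same native segmentMetadata query body used in the post."""
    return {
        "queryType": "segmentMetadata",
        "dataSource": datasource,
        "intervals": list(intervals),
    }


def post_native_query(base_url, query, timeout=30):
    """POST a native query to <base_url>/druid/v2 and return the decoded JSON."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/druid/v2",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (placeholder host, not a real node; 8083 is the default historical port):
# query = build_segment_metadata_query("gps", ["2021-09-06/2021-09-08"])
# print(post_native_query("http://historical-1.example.com:8083", query))
```

Running the same query against each historical in turn, and then against the broker, should show whether the empty result is specific to one node or consistent across the cluster.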
Assumptions
- Could it be caused by a segment version conflict?
- Could it be a ZooKeeper/historical segment discovery issue?
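One way to probe the version-conflict guess is to pull the affected rows from sys.segments and look for intervals that carry more than one version, or segments flagged as overshadowed or unavailable. The sketch below is only a rough diagnostic under some assumptions: it uses the broker SQL endpoint /druid/v2/sql on the default port 8082, the sys.segments columns start, end, version, is_published, is_available, and is_overshadowed, and helper names made up for illustration. More than one version in an interval is not necessarily a problem on its own (it can happen briefly around compaction), so treat the output as a hint rather than a verdict.

```python
import json
import urllib.request
from collections import defaultdict


def build_sys_segments_sql(datasource, start, end):
    """SQL listing segments whose start falls in [start, end).

    "start" and "end" are quoted because they are reserved words in SQL.
    Values are inlined for brevity, so only pass trusted strings.
    """
    return (
        'SELECT "segment_id", "start", "end", "version", '
        '"is_published", "is_available", "is_overshadowed", "num_rows" '
        "FROM sys.segments "
        f"WHERE \"datasource\" = '{datasource}' "
        f"AND \"start\" >= '{start}' AND \"start\" < '{end}'"
    )


def post_sql(broker_url, sql, timeout=30):
    """POST a SQL query to the broker and return the rows as a list of dicts."""
    request = urllib.request.Request(
        broker_url.rstrip("/") + "/druid/v2/sql",
        data=json.dumps({"query": sql}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def find_suspect_intervals(rows):
    """Return intervals with several versions, overshadowed or unavailable segments."""
    grouped = defaultdict(
        lambda: {"versions": set(), "overshadowed": 0, "unavailable": 0}
    )
    for row in rows:
        info = grouped[(row["start"], row["end"])]
        info["versions"].add(row["version"])
        if row["is_overshadowed"]:
            info["overshadowed"] += 1
        if not row["is_available"]:
            info["unavailable"] += 1
    return {
        interval: {
            "versions": sorted(info["versions"]),
            "overshadowed": info["overshadowed"],
            "unavailable": info["unavailable"],
        }
        for interval, info in grouped.items()
        if len(info["versions"]) > 1 or info["overshadowed"] or info["unavailable"]
    }


# Example call (placeholder broker host):
# rows = post_sql("http://broker.example.com:8082",
#                 build_sys_segments_sql("gps", "2021-09-06", "2021-09-08"))
# print(find_suspect_intervals(rows))
```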
Please help us here; this is a critical production issue for us.
Thank you in advance!