Overlord publishes Kafka Indexing Service segment with used=0 in metadata

druid release version: 0.10.0

At first the data source contained only segments created by KIS, and everything worked fine.

The data ingested by KIS is mostly linear, but sometimes we get new events in the past. In that case KIS would properly allocate a new shard with a time interval in the past, the overlord would publish the shard, and the coordinator would pick it up. Everything worked smoothly, although many shards were created for each segment interval.

To mitigate the problem of having many shards, we reingested the data with simple index tasks, e.g.:


{
  "type": "index",
  "spec": {
    "dataSchema": {
      "dataSource": "foo",
      "parser": {
        "type": "string",
        "parseSpec": {
          "format": "json",
          "timestampSpec": {
            "column": "timestamp",
            "format": "auto"
          },
          "dimensionsSpec": {
            "dimensions": [],
            "dimensionExclusions": [],
            "spatialDimensions": []
          }
        }
      },
      "metricsSpec": [
        {
          "type": "count",
          "name": "rows"
        }
      ],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "queryGranularity": "NONE",
        "intervals": ["2017-05-01T00:00:00.000Z/2017-05-12T00:00:00.000Z"]
      }
    },
    "ioConfig": {
      "type": "index",
      "firehose": {
        "type": "ingestSegment",
        "dataSource": "foo",
        "interval": "2017-05-01T00:00:00.000Z/2017-05-12T00:00:00.000Z"
      },
      "appendToExisting": false,
      "skipFirehoseCaching": false
    },
    "tuningConfig": {
      "type": "index",
      "targetPartitionSize": 5000000,
      "maxRowsInMemory": 75000,
      "buildV9Directly": true,
      "forceExtendableShardSpecs": true,
      "reportParseExceptions": true
    }
  }
}

The old segments (2017-05-01/2017-05-20) were reingested, and now, instead of hundreds of shards per segment, we have nice new single-partition segments.

Everything was fine until the Kafka Indexing Service started running again.

The problem is that every time KIS publishes a new shard allocated in a time interval that the above index task has reingested, the following happens:

The overlord allows the new shard to be created; then, when the shard is ready, the overlord publishes it BUT sets used=0 instead of used=1 for the new segment entry in druid_segments.

The effect is that the KIS task waits for the coordinator handoff, but the coordinator does not see the new shard because used=0. Hence the KIS task waits forever.
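To illustrate the stall (an illustrative sketch only, not actual Druid code): the coordinator only considers druid_segments rows with used=1 when deciding what to load, so a segment published with used=0 is simply invisible to it. The segment ids below are made up.

```python
# Illustrative sketch (not Druid code): a segment row published with used=0
# never appears in the coordinator's view, so handoff never completes.
segments = [
    {"id": "foo_2017-05-02_v1_shard42", "used": 0},  # new KIS shard, published as unused
    {"id": "foo_2017-05-02_v2_shard0", "used": 1},   # reingested segment
]

def coordinator_view(rows):
    """Return only the segment ids the coordinator will consider for loading."""
    return [r["id"] for r in rows if r["used"] == 1]

print(coordinator_view(segments))  # the used=0 KIS shard is missing from this list
```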

If I go to the SQL console and run update druid_segments set used=1 where id='ID_OF_THE_NEW_SEGMENT_SHARD', the coordinator hands off the new shard, the KIS task finishes, and everything goes back to normal.
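The manual workaround above can be sketched end to end with SQLite (the real metadata store is typically MySQL or PostgreSQL, and the actual druid_segments table has more columns; this is a simplified model of just the used flag):

```python
# Minimal sketch of the manual metadata fix, using an in-memory SQLite table
# as a stand-in for the Druid metadata store (simplified schema).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE druid_segments (id TEXT PRIMARY KEY, used INTEGER)")
# The shard as published by the overlord: used=0, invisible to the coordinator.
conn.execute("INSERT INTO druid_segments VALUES ('ID_OF_THE_NEW_SEGMENT_SHARD', 0)")

# The fix applied in the SQL console:
conn.execute("UPDATE druid_segments SET used=1 WHERE id='ID_OF_THE_NEW_SEGMENT_SHARD'")

(used,) = conn.execute(
    "SELECT used FROM druid_segments WHERE id='ID_OF_THE_NEW_SEGMENT_SHARD'"
).fetchone()
print(used)  # 1 -- the coordinator can now pick the shard up and hand it off
```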

But WHY does the overlord publish this new shard with used=0 in metadata?

Is this a bug, or a feature that can be disabled or changed via overlord settings?

It sounds like you're hitting https://github.com/druid-io/druid/pull/4257. You should be OK if you backport that patch; the fix will be part of 0.10.1.

Great, that was it. I've applied the patch to the 0.10.0 branch and recompiled it, and now the segments are published correctly. Thanks a lot!