New (second) historical node loads segments and deletes them immediately

Hi

i have a cluster running on 0.8 rc1. After some problems with my historical node I added a second one and installed 0.9.1.

We are using HDFS and the setup was pretty smooth.

So I have two historical nodes: hist1 and hist2. Both have among a lot other configuration the following:

druid.segmentCache.locations=[{“path”:"/data/druid_hdfs/indexCache",“maxSize”:130000000000}]

druid.server.maxSize=130000000000

hist1 currently has around 100G in /data/druid_hdfs/indexCache

hist2 is somewhere between 0G and 5G.

What I see now is that the hist1 receives a lot of load events like:

2016-08-18T15:04:28,526 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - New request[LOAD: wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z] with zNode[/druid/loadQueue/XXX.XXX.XXX.XXX:8085/wikidata_2016-08-10T15:02:00.000Z_2016-08-10T15:03:00.000Z_2016-08-10T15:02:00.000Z].

it loads and completes that segment:

2016-08-18T15:04:28,526 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Loading segment wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z

2016-08-18T15:04:28,567 INFO [ZkCoordinator-0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Announcing segment[wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z] at existing path[/druid/segments/XXX.XXX.XXX.XXX:8085/XXX.XXX.XXX.XXX:8085_historical__default_tier_2016-08-18T15:04:25.766+02:00_07766fdbcb464d6fa88ed3950bf2d44710]

2016-08-18T15:04:28,568 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Completed request [LOAD: wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z]

2016-08-18T15:05:29,841 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - New request[DROP: wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z] with zNode[/druid/loadQueue/XXX.XXX.XXX.XXX:8085/wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z].

But a couple of seconds/minutes later, it deletes that segment again, because of a DROP request:

2016-08-18T15:05:29,841 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - New request[DROP: wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z] with zNode[/druid/loadQueue/XXX.XXX.XXX.XXX:8085/wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z].

2016-08-18T15:05:29,842 INFO [ZkCoordinator-0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z] at path[/druid/segments/XXX.XXX.XXX.XXX:8085/XXX.XXX.XXX.XXX:8085_historical__default_tier_2016-08-18T15:04:25.766+02:00_07766fdbcb464d6fa88ed3950bf2d44710]

2016-08-18T15:05:29,843 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Completely removing [wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z] in [30,000] millis

2016-08-18T15:05:29,844 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Completed request [DROP: wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z]

2016-08-18T15:05:29,844 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - zNode[/druid/loadQueue/XXX.XXX.XXX.XXX:8085/wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z] was removed

2016-08-18T15:05:59,843 INFO [ZkCoordinator-Exec–0] io.druid.server.coordination.ServerManager - Attempting to close segment wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z

2016-08-18T15:05:59,843 INFO [ZkCoordinator-Exec–0] io.druid.segment.ReferenceCountingSegment - Closing wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z

2016-08-18T15:05:59,843 INFO [ZkCoordinator-Exec–0] io.druid.segment.ReferenceCountingSegment - Closing wikidata_2016-08-10T17:02:00.000+02:00_2016-08-10T17:03:00.000+02:00_2016-08-10T15:02:00.000Z, numReferences: 0

This happens now over and over again. It loads, and deletes data.

Is this an expected behaviour? Or is that something about my configuration? If so, which part should I post here for you to have a look.

Thanks a lot for your help.

Best regards

Roman

Hi Roman, are the load/drop rules you have set up?

Hi,

I don’t think I have any rules that should be related to that problem:

Okay, after switching from 0.8.1-rc1 to 0.9.1.1 for the coordinator, this problem does not occure anymore.

Guys I have this problem on 0.9.1.1 …

I loaded segments by the insert-segments tool to Mysql, started one or two historical nodes and it always immediately drops freshly loaded segments…

There is only one default rule : load forever … changing _default_tier from 2 to 1 doesn’t affect this…

Any idea how to fix it ?

Are you sure the segments have used marked as ‘1’ in your segment metadata storage?