Coordinator losing connection with Historical nodes

Hi,

I have a Druid cluster with 6 historical nodes and the coordinator in HA mode.

When we start the cluster, all historical nodes are displayed on the coordinator console, but after some time (not a fixed interval) some historicals are no longer visible there, even though the historical process is still up.

If we restart the historical process, it shows up on the coordinator console again.

Do you see any errors in the historical node logs?
FWIW, each historical node announces itself in ZooKeeper, and the coordinator listens to those ZooKeeper announcements to get a view of the available historicals.

I guess it could be an issue with the connection between ZooKeeper and the historical, or some other exception on the historical side that is causing this.
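
If it helps, you can check that view directly by listing what is announced in ZooKeeper and comparing it with the historicals you expect. This is only a rough sketch: it assumes the default base path /druid (so announcements sit under /druid/announcements), that announcement znodes are named host:port, and the third-party kazoo client. The function name and the host names in the comments are made up for illustration.

```python
def missing_historicals(zk_client, expected, path="/druid/announcements"):
    """Return the expected nodes that are not currently announced under `path`.

    `zk_client` is any object with a get_children(path) method, for example
    a started kazoo.client.KazooClient. The default path assumes
    druid.zk.paths.base is left at /druid; change it if yours differs.
    """
    announced = set(zk_client.get_children(path))
    return sorted(set(expected) - announced)


# Example use against a live cluster (hypothetical host names):
#
#   from kazoo.client import KazooClient   # pip install kazoo
#   zk = KazooClient(hosts="zk1.example.com:2181")
#   zk.start()
#   print(missing_historicals(zk, ["historical1.example.com:8083"]))
#   zk.stop()
```

If a historical process is running but shows up in the returned list, its announcement is gone from ZooKeeper, which points at the historical-to-ZooKeeper session rather than at the coordinator.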

One thing to look for is long and frequent GC pauses. I have seen those cause dropped connections to the ZooKeeper server, which makes the coordinator lose track of the historical.
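
To check for such pauses, something like the following could scan a historical's GC log. It is a minimal sketch that assumes the JVM runs with -XX:+PrintGCApplicationStoppedTime (Java 8 style logging), so that lines of the form "Total time for which application threads were stopped: N seconds" are present. The function name, the file path in the comment, and the 10-second threshold are arbitrary choices for illustration.

```python
import re

# Line printed by HotSpot when -XX:+PrintGCApplicationStoppedTime is enabled.
STOPPED = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)


def long_pauses(lines, threshold_s=10.0):
    """Return the pause durations (in seconds) that are >= threshold_s."""
    pauses = []
    for line in lines:
        match = STOPPED.search(line)
        if match:
            seconds = float(match.group(1))
            if seconds >= threshold_s:
                pauses.append(seconds)
    return pauses


# Example use (hypothetical log path):
#
#   with open("/var/log/druid/historical-gc.log") as f:
#       print(long_pauses(f))
```

Pauses that approach the ZooKeeper session timeout configured for the historical (druid.zk.service.sessionTimeoutMs in Druid's settings) are the ones most likely to make the session expire.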

Hi,

There are no frequent GC pauses in the logs.

But I did see log entries where the historical loses its connection to ZooKeeper.