Historical nodes lose connection every hour

Every hour our Druid Historical nodes lose their connection to our ZooKeeper ensemble (3 instances). In most cases they reconnect successfully, but sometimes the connection stays lost until we intervene manually.
MiddleManagers run on the same VMs as the Historical nodes, and their ZooKeeper connections are fine.
Druid runs in Azure on separate VMs; the Druid version is 0.13.0-SNAPSHOT.

Here is a Historical log example:

2018-10-29T11:01:23,363 WARN [main-SendThread(10.0.0.5:2181)] org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 21053ms for sessionid 0x3662e89a08805d5
2018-10-29T11:01:23,471 WARN [qtp1862552664-80[select_[bf_impression_click]_593cbea4-fcf9-47ef-b13d-fd3e33235836]] io.druid.server.QueryLifecycle - Exception while processing queryId [593cbea4-fcf9-47ef-b13d-fd3e33235836]
2018-10-29T12:01:23,237 WARN [main-SendThread(10.0.0.4:2181)] org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 23848ms for sessionid 0x3662e89a08805d5
2018-10-29T12:01:23,366 WARN [qtp1862552664-68[select_[bf_impression_click]_593cbea4-fcf9-47ef-b13d-fd3e33235836]] io.druid.server.QueryLifecycle - Exception while processing queryId [593cbea4-fcf9-47ef-b13d-fd3e33235836]

Hi, 0.13.0 hasn’t been released yet (except for release candidates) and is still considered unstable. I would recommend using the latest released version (0.12.3) available here: http://druid.io/downloads.

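It might be GC-related. Try enabling GC logs and see whether any long pauses show up around the time of the disconnects. A GC pause that lasts longer than the ZooKeeper session timeout stops the client from sending heartbeats, which causes exactly this kind of disconnect.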
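
To make that comparison easier, here is a minimal Python sketch that pulls the ZooKeeper session timeouts out of a Historical log and computes, for each one, the window during which the client heard nothing from the server. It assumes the log lines look exactly like the ones posted above; the names `parse_timeouts` and `silent_windows` are made up for this example and are not part of Druid or ZooKeeper. Any GC pause in the GC log that overlaps one of these windows would be a good suspect.

```python
import re
from datetime import datetime, timedelta

# Matches the ZooKeeper client warning as it appears in the log excerpt above.
# Assumption: other Druid/ZooKeeper versions may format this line differently.
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}),(?P<ms>\d{3}) WARN "
    r"\[main-SendThread\((?P<server>[^)]+)\)\] "
    r"org\.apache\.zookeeper\.ClientCnxn - Client session timed out, "
    r"have not heard from server in (?P<silent>\d+)ms"
)


def parse_timeouts(lines):
    """Return (timestamp, zk_server, silent_ms) for every session-timeout line."""
    events = []
    for line in lines:
        m = TIMEOUT_RE.match(line)
        if m is None:
            continue  # not a session-timeout warning
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S")
        ts = ts.replace(microsecond=int(m.group("ms")) * 1000)
        events.append((ts, m.group("server"), int(m.group("silent"))))
    return events


def silent_windows(events):
    """Return (start, end) of the period in which the client heard nothing."""
    return [(ts - timedelta(milliseconds=silent), ts) for ts, _, silent in events]


# The two timeout lines from the post, copied verbatim.
SAMPLE = [
    "2018-10-29T11:01:23,363 WARN [main-SendThread(10.0.0.5:2181)] "
    "org.apache.zookeeper.ClientCnxn - Client session timed out, have not "
    "heard from server in 21053ms for sessionid 0x3662e89a08805d5",
    "2018-10-29T12:01:23,237 WARN [main-SendThread(10.0.0.4:2181)] "
    "org.apache.zookeeper.ClientCnxn - Client session timed out, have not "
    "heard from server in 23848ms for sessionid 0x3662e89a08805d5",
]

events = parse_timeouts(SAMPLE)
windows = silent_windows(events)

for (ts, server, silent), (start, end) in zip(events, windows):
    print(f"{server}: silent for {silent} ms, from {start.time()} to {end.time()}")

# Gap between consecutive timeouts, to confirm the hourly pattern.
for (prev, _, _), (curr, _, _) in zip(events, events[1:]):
    print(f"gap between timeouts: {(curr - prev).total_seconds():.3f} s")
```

In a real run you would pass the lines of the Historical log file to `parse_timeouts` instead of `SAMPLE`. For the excerpt above, the two timeouts are almost exactly one hour apart, which may also point to something scheduled (a periodic job or a network or VM-level event) rather than GC alone.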