Historical nodes fail to connect to ZooKeeper

I set up a Druid cluster with three nodes, and it was working really well.

The Druid version is 0.12.3.

Node 1: ZooKeeper, coordinator and overlord (8 GB RAM)

Node 2: broker (8 GB RAM)

Node 3: historical, middle managers and Tranquility 8.2 (12 GB RAM, 310 GB hard disk)

But all of a sudden my historical nodes fail to connect to ZooKeeper. I restarted the historical node and the whole machine many times, but it never works; it always shows that establishing the connection failed.

I'm storing the Druid data in a local Derby database (the total size of the segments is 12 GB).

I removed the segments-cache and tried again, but the issue is still the same. If it were really a memory issue, it should work once I remove the segments-cache, right?

The error below is produced only by the historical nodes:

org.apache.zookeeper.ClientCnxn - Opening socket connection to server 192.168.207.39/192.168.207.39:2181. Will not attempt to authenticate using SASL (unknown error)
2019-06-04T05:18:57,288 INFO [main-SendThread(192.168.207.39:2181)] org.apache.zookeeper.ClientCnxn - Socket connection established to 192.168.207.39/192.168.207.39:2181, initiating session
2019-06-04T05:18:57,290 INFO [main-SendThread(192.168.207.39:2181)] org.apache.zookeeper.ClientCnxn - Session establishment complete on server 192.168.207.39/192.168.207.39:2181, sessionid = 0x16b1d6aae8d0005, negotiated timeout = 30000
2019-06-04T05:18:57,291 INFO [main-EventThread] org.apache.curator.framework.state.ConnectionStateManager - State change: RECONNECTED
2019-06-04T05:19:17,299 WARN [main-SendThread(192.168.207.39:2181)] org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 20002ms for sessionid 0x16b1d6aae8d0005
2019-06-04T05:19:17,299 INFO [main-SendThread(192.168.207.39:2181)] org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 20002ms for sessionid 0x16b1d6aae8d0005, closing socket connection and attempting reconnect
2019-06-04T05:19:17,402 INFO [main-EventThread] org.apache.curator.framework.state.ConnectionStateManager - State change: SUSPENDED
2019-06-04T05:19:18,886 INFO [main-SendThread(192.168.207.39:2181)] org.apache.zookeeper.ClientCnxn - Opening socket connection to server 192.168.207.39/192.168.207.39:2181. Will not attempt to authenticate using SASL (unknown error)
2019-06-04T05:19:18,888 INFO [main-SendThread(192.168.207.39:2181)] org.apache.zookeeper.ClientCnxn - Socket connection established to 192.168.207.39/192.168.207.39:2181, initiating session
2019-06-04T05:19:18,890 INFO [main-SendThread(192.168.207.39:2181)] org.apache.zookeeper.ClientCnxn - Session establishment complete on server 192.168.207.39/192.168.207.39:2181, sessionid = 0x16b1d6aae8d0005, negotiated timeout = 30000
2019-06-04T05:19:18,890 INFO [main-EventThread] org.apache.curator.framework.state.ConnectionStateManager - State change: RECONNECTED

Can someone suggest a solution to this issue? We have been debugging it for the past two days. Thank you.

Hi Sai,
Do you see any errors in the ZooKeeper logs on 192.168.207.39?

Do you see any issues in the coordinator logs?

What is your ZK quorum?
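
To check from the historical machine whether the ZooKeeper server is actually responding, a small probe like the one below may help. It is only a sketch: it assumes ZooKeeper's four-letter admin commands (such as `ruok`) are enabled on your server (on some ZooKeeper versions they have to be allowed through `4lw.commands.whitelist`), and the function names `zk_four_letter` and `zk_is_ok` are made up for this example.

```python
import socket


def zk_four_letter(host, port, command="ruok", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw reply as text.

    The server is expected to answer and then close the connection,
    so we simply read until end of stream.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def zk_is_ok(reply):
    """A healthy server answers 'imok' to the 'ruok' command."""
    return reply.strip() == "imok"
```

Running `zk_is_ok(zk_four_letter("192.168.207.39", 2181))` on node 3 should return True if the server answers. A timeout or an empty reply there would point at ZooKeeper or the network rather than at the historical process itself.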

In general, Derby is not suitable for a multi-node cluster. I think you would be better off with MySQL as your metadata store, as you are running a multi-node Druid cluster.
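
A MySQL metadata store is typically configured in common.runtime.properties on every node, roughly as in the fragment below. Treat it as a sketch: the host name, database name, user and password are placeholders, and the extension should be appended to whatever is already in your existing druid.extensions.loadList rather than replacing it.

```
# Placeholder values; adjust host, database, user and password to your setup.
druid.extensions.loadList=["mysql-metadata-storage"]
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://mysql.example.com:3306/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=changeme
```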

Thanks,

–siva