Unknown exception: No known server

Relates to Apache Druid 0.21.1

Hi,

We are seeing the following error in our Druid console:
Unknown exception / …common.IOE : No known server exception

In the MiddleManager logs, we have:
WARN [WorkerTaskManager-CompletedTasksCleaner] org.apache.druid.indexing.worker.WorkerTaskManager - Exception while getting active tasks from overlord. will retry on next scheduled run.
org.apache.druid.java.util.common.IOE: No known server

In the Overlord logs:
org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisor - Failed to get task runner because I'm not the leader!

In the ZooKeeper logs:
2021-12-16 09:37:09,027 [myid:1] - WARN [NIOWorkerThread-2:NIOServerCnxn@366] - Unable to read additional data from client sessionid 0x1016c1d6ca8000c, likely client has closed socket
2021-12-16 09:37:37,619 [myid:1] - INFO [SessionTracker:ZooKeeperServer@413] - Expiring session 0x1016c1d6ca8000c, timeout of 30000ms exceeded

I don't know whether all these errors are linked. How can I investigate further to find the root cause(s) and solve the issue(s)?

Regards

Hi Alaure,

Is this occurring during startup? It almost seems like not all the processes have completed their startup procedure.

Was it working and now it isn’t? If so, what changed?

Can you tell us a bit more about your cluster setup?
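
One way to check the "not the leader" symptom directly is to ask each Overlord which node it considers the leader. The sketch below is only an illustration: it assumes the Overlord API is reachable at a base URL such as http://localhost:8090 (adjust host and port for your cluster), it uses the /druid/indexer/v1/leader endpoint, and the helper name overlord_leader is made up for this example.

```python
import urllib.request


def overlord_leader(base_url, opener=urllib.request.urlopen, timeout=5):
    """Return the leader address reported by an Overlord, or None.

    base_url is an assumption, e.g. "http://localhost:8090".
    opener is injectable so the function can be exercised without a cluster.
    """
    url = base_url.rstrip("/") + "/druid/indexer/v1/leader"
    try:
        with opener(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8").strip()
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None
    # Treat an empty body the same as "no leader known".
    return body or None
```

Calling overlord_leader("http://localhost:8090") against each Overlord and comparing the answers shows whether they agree on a leader. Getting None from every Overlord would be consistent with the "No known server" error reported by the MiddleManager.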

Thanks for your answer and sorry for my late one.
In the end, we solved the issue: ingestion was failing because of an out-of-memory error.
We increased some MiddleManager buffer limits, and the situation has stabilized.

Regards

Alaure
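
For readers hitting the same symptom: the thread does not say which limits were raised, so the following is only a rough sizing check under an assumption, namely that the out-of-memory error involved direct memory for processing buffers. Druid's tuning guidance gives a lower bound of buffer.sizeBytes * (numMergeBuffers + numThreads + 1) per process. The function names and the sample values below are illustrative and not taken from the cluster discussed here.

```python
MIB = 1024 * 1024


def min_direct_memory_bytes(buffer_size_bytes, num_threads, num_merge_buffers):
    """Lower bound on direct memory for one Druid process:
    sizeBytes * (numMergeBuffers + numThreads + 1)."""
    return buffer_size_bytes * (num_merge_buffers + num_threads + 1)


def fits(max_direct_memory_bytes, buffer_size_bytes, num_threads, num_merge_buffers):
    """True if the configured direct memory limit covers the lower bound."""
    needed = min_direct_memory_bytes(buffer_size_bytes, num_threads, num_merge_buffers)
    return max_direct_memory_bytes >= needed


# Illustrative values only: 100 MiB buffers, 2 processing threads, 2 merge buffers.
needed = min_direct_memory_bytes(100 * MIB, num_threads=2, num_merge_buffers=2)
print(needed // MIB)  # 500
```

If the task JVMs are started with a direct memory limit below this bound, either the limit or the buffer settings would need to change.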