Druid indexing task failures when one of the HDFS worker dies

Hi,

I’m using Druid 0.11.0. After 5 days of running stable, I see a lot of indexing task failures on Druid. Seems like there are a lot of connections in the CLOSE_WAIT state on one of the HDFS worker and this worker is reported as dead on the HDFS UI (worker process is still alive inside the docker container in which it runs). In my environment, Druid is the only component that interacts with HDFS. I found this bug logged against HDFS-client. Has anyone hit this issue when using Druid along with HDFS?

Thanks,

Avinash

Does anyone know any workaround for this issue. I know this is a HDFS-client specific issue but I’m just wondering if anyone else in the community hit this issue.

Thanks,

Avinash