All the servers in the cluster stopped suddenly

Dear friends in the Druid community:

I am deploying Druid in cluster mode on three nodes. I started the master server on the mrs01 node, the data server on the mrs02 node, and the broker server on the mrs03 node.

The following picture shows the properties of the Druid cluster.

Then I started ingesting streaming data into Druid. Everything seemed to be going well, and I could query the datasource I was ingesting into.

But a few minutes later I ran into a problem: all the processes on the nodes had stopped. I do not know how this could happen. I went through all the Druid server logs but could not find the solution. I attach all the related logs here.

I then checked the machine logs, /var/log/messages and /var/log/secure, which I also attach here.

The time window I was operating in is from Oct 14 18:44:36 to Oct 14 18:48:07, so you can find the corresponding entries in the messages and secure logs and in the Druid logs.

Can anyone help me?

coordinator-overlord.log (497 KB)

broker.log (105 KB) (4.72 KB)

historical.log (120 KB)

messages (23.7 KB)

middleManager.log (89.2 KB)

router.log (104 KB)

secure (8.14 KB)

Hey Michael,

I’m not sure what’s going on, but it seems like something is terminating your processes after a while. Sometimes this happens if you launch them in a terminal in the background and then log out. You might want to try launching them with nohup, in that case.
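For anyone who wants to test the "killed on logout" theory from a script, here is a rough sketch in Python. It is not nohup itself: nohup makes the process ignore SIGHUP, while this sketch starts the process in its own session so the terminal's hangup is not delivered to it. The helper name `launch_detached` and the log path are made up for illustration, and `start_new_session` is POSIX-only.

```python
import subprocess


def launch_detached(cmd, log_path):
    """Start cmd in a new session so closing the terminal does not signal it.

    cmd is a list such as ["bin/some-start-script"] (hypothetical; use the
    start script you actually run). Output is appended to log_path.
    """
    log = open(log_path, "ab")
    try:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,   # do not hold on to the terminal's stdin
            stdout=log,                 # send stdout to the log file
            stderr=subprocess.STDOUT,   # and stderr to the same place
            start_new_session=True,     # child calls setsid() before exec
        )
    finally:
        log.close()  # the child keeps its own copy of the file descriptor
    return proc
```

If the processes still disappear when started this way, the logout is probably not what kills them, and the machine logs are the next place to look.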

Hi, Gian Merlino:

You are right. I did indeed launch them with nohup. Could this still happen when I use nohup?

On Tue, Oct 15, 2019 at 2:07 PM, Gian Merlino wrote:

Oh, well, I would think nohup would be the fix actually. There might be something weird going on in your environment that is killing your processes. It isn’t normal for them to exit for no reason like that. Maybe you have something stopping the container they’re running in?

These three machines are new. We have not configured any other software or environment on them. I have no idea what on earth is stopping my processes. It is very strange.

On Tue, Oct 15, 2019 at 2:37 PM, Gian Merlino wrote:

I meant that I was still using nohup to run Druid, but the processes would shut down without any reason.

On Tue, Oct 15, 2019 at 2:41 PM, scoffi Michaeal wrote:

I used screen to run Druid. It seems to resolve this problem.

On Tue, Oct 15, 2019 at 2:43 PM, scoffi Michaeal wrote:

Glad to know that, Scoffi.

Thanks,


Thank you

On Tue, Oct 15, 2019 at 4:28 PM, Vaibhav Vaibhav wrote: