Realtime restarted successfully, but the RealtimeStopManager failed.

We restarted the realtime nodes in production this morning. After sending 'stop realtime', the realtime node stopped successfully. But when we started realtime again, an error appeared: 'port is already in use'.
We also checked that the stop-listener port 9991 is not open. I think this was caused by my restarting too quickly, without checking that the process had actually exited.

Is there any way to invoke the RealtimeStopManager to startListen, or is there another way to safely close the realtime node?

Best

“Port already in use” usually means you tried to kill a process, but it didn’t actually die. Druid nodes will start doing a clean shutdown when killed, but this may take some time. Generally, if you wait for this to finish (wait for the pid to disappear on its own) then you’ll get a good clean shutdown.
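The wait-for-the-pid-to-disappear approach Gian describes could be sketched as a small shell helper. This is only an illustration: the function name, the one-second polling interval, and the default timeout are all assumptions, not anything Druid ships with.

```shell
# Hypothetical helper: send SIGTERM to a pid and poll until the process is
# gone (up to a timeout, default 120s), so the node can finish its clean
# shutdown before you start a new one.
stop_and_wait() {
  pid=$1
  timeout=${2:-120}
  kill "$pid" 2>/dev/null      # SIGTERM starts the clean shutdown
  i=0
  while kill -0 "$pid" 2>/dev/null; do
    i=$((i + 1))
    if [ "$i" -gt "$timeout" ]; then
      echo "timed out waiting for pid $pid" >&2
      return 1
    fi
    sleep 1
  done
  return 0
}
```

You would call it with the realtime node's pid, e.g. from a pid file whose location depends on your deployment (the path here is made up): `stop_and_wait "$(cat /var/run/druid/realtime.pid)"`.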

Yes, I am aware of that. My concern is how to safely close the realtime node next time.

On Friday, September 25, 2015 at 9:34:40 UTC+8, Gian Merlino wrote:

Does the realtime node not exit cleanly if you kill it with sigterm? It should. If it does, I would just recommend doing that in the future.

If I just kill the process, a 'v8-tmp' folder is left behind in the basePersist path, and reloading that data fails when we restart the realtime node.

On Friday, September 25, 2015 at 9:49:43 UTC+8, Gian Merlino wrote:

Ah, I see. That happens if you shutdown the node while it’s persisting. That will be fixed in 0.8.2 (the fix is already in master: https://github.com/druid-io/druid/pull/1747).

I think our way to work around it would be: stop consuming from Kafka, wait for some span of time (until the last intermediate persist has finished), and then restart the realtime node.
Does that make sense?
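The "wait until the last intermediate persist has finished" step could be approximated by watching for in-progress persist directories to disappear. A rough sketch, assuming (based on this thread, not documented behavior) that an in-flight persist shows up as a 'v8-tmp' directory under basePersist; the function name and polling interval are illustrative:

```shell
# Hypothetical helper: block until no "v8-tmp" directories remain under the
# given basePersist path, i.e. until no intermediate persist is in flight.
wait_for_persists() {
  base=$1
  while find "$base" -type d -name 'v8-tmp' 2>/dev/null | grep -q .; do
    echo "intermediate persist still in progress under $base; waiting..."
    sleep 5
  done
}
```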

On Friday, September 25, 2015 at 10:13:31 UTC+8, Gian Merlino wrote:

That should work, if you have the ability to stop kafka from delivering data.

Another thing you can do is when you’re about to start up a realtime node, just delete any on-disk persists that didn’t complete. This is how the fix in 0.8.2 works. It’s ok because the partially persisted data hasn’t actually been committed yet, so it will be re-read when the node starts back up.
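That pre-start cleanup could look something like the sketch below. The 'v8-tmp' directory name comes from earlier in this thread; double-check the on-disk layout for your Druid version before running anything destructive, since this deletes directories.

```shell
# Hypothetical sketch: before starting the realtime node, remove any
# leftover incomplete persist directories under basePersist. Partially
# persisted data was never committed, so it will be re-read on startup.
clean_incomplete_persists() {
  base=$1
  find "$base" -type d -name 'v8-tmp' -prune -print | while read -r dir; do
    echo "removing incomplete persist: $dir"
    rm -rf "$dir"
  done
}
```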

We use version 0.6.121, with the Kafka offsets committed automatically every 10s.
I will try the approach you just mentioned.

Thank you Gian!

On Friday, September 25, 2015 at 11:45:40 UTC+8, Gian Merlino wrote: