strange shutdowns

Hi,

This morning two of our three PetaSAN servers shut down unexpectedly. At the time the hosts turned off, our network admin was plugging in cables and changing some network settings (MTU, Spanning Tree, Flow Control). According to the journal log, there was first a short outage of the backend links, and afterwards the system was shut down.

My question: is there something in the cluster that could shut down the servers if there is a connection problem?

Regards,

Dennis

Yes, we do simple software-based fencing. When a node is not able to connect to the cluster, it cleans up any resources it currently holds (IPs/iSCSI paths) so that they can be served by other nodes. The other nodes will also try to kill it before failing over these resources. This happens when the failed node exceeds its timeout for reporting its health-check heartbeat to the cluster (via Consul).
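
To make the idea concrete, here is a rough sketch of heartbeat-timeout fencing in Python. It is not PetaSAN's actual code: the `Node` class, the `fence_expired_nodes` function, the round-robin redistribution and the 15-second timeout are all made up for illustration, and "fencing" here just sets a flag instead of powering anything off.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    last_heartbeat: float                      # time of the last health-check report (seconds)
    resources: List[str] = field(default_factory=list)  # e.g. IPs / iSCSI paths
    fenced: bool = False


def fence_expired_nodes(nodes: List[Node], now: float, timeout: float) -> List[str]:
    """Fence nodes whose heartbeat is older than `timeout` and hand their
    resources to the remaining healthy nodes (hypothetical round-robin)."""
    expired, healthy = [], []
    for node in nodes:
        if node.fenced:
            continue
        if now - node.last_heartbeat > timeout:
            expired.append(node)
        else:
            healthy.append(node)

    fenced_names = []
    for node in expired:
        if not healthy:
            break  # nobody left to take over, so do nothing
        node.fenced = True  # stands in for "kill the node"
        fenced_names.append(node.name)
        # Fail over only after the node has been fenced.
        for i, resource in enumerate(node.resources):
            healthy[i % len(healthy)].resources.append(resource)
        node.resources = []
    return fenced_names


if __name__ == "__main__":
    cluster = [
        Node("node1", last_heartbeat=100.0, resources=["ip-a"]),
        Node("node2", last_heartbeat=100.0, resources=["ip-b"]),
        Node("node3", last_heartbeat=80.0, resources=["ip-c", "path-1"]),
    ]
    # node3 last reported 21 s ago, which exceeds the assumed 15 s timeout.
    print(fence_expired_nodes(cluster, now=101.0, timeout=15.0))
    for n in cluster:
        print(n.name, n.fenced, n.resources)
```

The point of the ordering is that resources are only moved after the suspect node has been fenced, so two nodes never serve the same IP or iSCSI path at once.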

In the future we will allow more advanced hardware-based fencing, such as STONITH/IPMI.

After rebooting a node, it often happens that the other nodes shut it down again. How can we prevent this?

Regards,

Dennis

Just wait a couple of minutes before starting the machine. This is expected with fencing: the other nodes cannot be 100% sure whether the suspected node is now OK or still dying, so this goes on until all the other nodes have agreed on how to distribute the failed node's resources among themselves.