Forums

Home / Forums

You need to log in to create posts and topics. Login · Register

Automatic shutdown of nodes

Testing a cluster of 3 nodes. 2 physical disks, one virtual. Observed a spontaneous shutdown of the physical node. What could be the reason ? Where to dig ? On one note (DELL R530 when loading ACPI error, the second one loads normally) But both are turned off periodically.

This is probably fencing in action, a node being shutdown by other nodes if it does not respond in time to cluster heartbeats. This is done so they can take over resources (ip, disk io) while making sure the original node is not doing any io itself.

You can turn off fencing from maintenance tab to stop this but it is not recommended.

You should find out the root cause of this, ie why the node is not responding to heartbeats, it could be connection errors on backend  interface, hardware errors, or too much load relative to your hardware.

Hi, I had a similar issue in the past, and it was caused by the backend network interface hanging from time to time, due to an IRQ conflict with the SATA controller.

Quote from Ste on May 4, 2020, 12:47 pm

Hi, I had a similar issue in the past, and it was caused by the backend network interface hanging from time to time, due to an IRQ conflict with the SATA controller.

Hi. It seems to be the same for me. And how Did you manage the conflicts ? Did you move your network card to another slot ? I poeticas BIOS but nothing for the IRQ settings have not yet found.

Hi, sorry for the late reply, but that was a faulty motheboard, so we simply replaced it.

Bye.