2 Nodes shutting down after Network impact

Hi,

We are testing PetaSAN with iSCSI and S3 in unusual scenarios to make sure we can use it in production.

At the moment our environment has 6 storage nodes and 3 monitoring nodes.

In the last test, we blocked all network traffic on both switches, so no communication between any of the nodes was possible. We left it in this state for about 6 hours.

The result was that two storage nodes (2 and 3) were shut down automatically.

We rebooted the switches, so network traffic was possible again. We then powered on both offline nodes; after they were up, nodes 1 and 3 shut down.

How can we debug this situation?

Ceph health has the following warnings:


Thanks

Kind Regards

The shutdown was probably due to fencing. You can disable fencing from the maintenance page, but it is better to leave it enabled.

In most cases, with a 2-switch setup and assuming you have bonded interfaces for HA, you would want to test shutting down 1 switch at a time to verify that the network setup is highly available and the cluster keeps functioning.

Shutting down both switches is essentially a cluster shutdown: all nodes lose connection to one another and you have no cluster. For the nodes to re-peer, the easiest thing is to restart all nodes.
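
For the one-switch-at-a-time test, it can help to confirm on each node that the bond really has both links up before a switch is pulled, and that only one goes down afterwards. Below is a minimal Python sketch that reads the Linux bonding driver's status file. The bond name bond0 and the helper name slave_status are assumptions for illustration, not something PetaSAN provides; adjust them to your setup.

```python
from pathlib import Path


def slave_status(bonding_text):
    """Map each slave interface of a bond to its MII status.

    Expects the text of /proc/net/bonding/<bond>, as written by the
    Linux bonding driver.
    """
    status = {}
    current = None
    for line in bonding_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # The bond-level "MII Status" line appears before any slave
            # section, so it is skipped while current is still None.
            status[current] = value
    return status


if __name__ == "__main__":
    # Assumption: the bond is called bond0; change to match your node.
    path = Path("/proc/net/bonding/bond0")
    if path.exists():
        for iface, state in slave_status(path.read_text()).items():
            print(f"{iface}: {state}")
    else:
        print(f"{path} not found; is bonding configured on this node?")
```

Run it on a node before and after shutting down one switch: with a healthy bond you would expect both links to report up beforehand and exactly one to report down afterwards.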

Thanks for your answer.

That shouldn't be a normal situation, but we are testing things like that to see what can happen.

After several restarts of the offline nodes, all came back.

We did the test again with fencing disabled, and we had no shutdown.

Thanks a lot, and have a nice day.