Forums

Failover Query

Hi,

Could someone advise on the usual failover time for a cluster?

We have 3 nodes at present (testing): 1 management-only node and 2 with replicated OSDs. These 2 nodes run iSCSI, CIFS and NFS.

If I pull the power on one of the nodes the following happens:

The dashboard shows a health warning about node 1 (management) having slow ops
IPs for CIFS fail over to node 3
IPs for NFS do not fail over
IPs for iSCSI do not fail over
Trying to get a status for iSCSI or NFS while one node is off leads to a gateway timeout.

As soon as power is restored to the node that was off, the dashboard picks up the dropped OSD, adjusts for a few seconds, and then in the blink of an eye everything is back to normal.

Any advice would be appreciated.

Adrian

The default pool replica size is 3 and the min size is 2. If the number of available replicas drops below the min size, your pool will not be active.

If you have only 2 storage nodes, you can set your pool size to 2 and min size to 1 so the pool stays active when only 1 node is up. This is not recommended at all, though, as it could lead to data integrity issues from split-brain, etc. If you value your data, use a size 3 pool (the default) and have 3 nodes.
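
As a rough illustration of the size / min size rule above, here is a minimal Python sketch. It is a simplified model, not how the cluster actually decides things: it assumes one replica per storage node and ignores everything else (placement groups, multiple OSDs per node, recovery). The function name `pool_is_active` and its parameters are made up for this example.

```python
def pool_is_active(size: int, min_size: int, storage_nodes_up: int) -> bool:
    """Simplified model (assumption): one replica per storage node.

    The pool is treated as active only while the number of reachable
    replicas is at least min_size.
    """
    if not 1 <= min_size <= size:
        raise ValueError("min_size must be between 1 and size")
    # A pool can never have more replicas than its configured size,
    # and in this model each running storage node holds at most one.
    replicas_available = min(size, storage_nodes_up)
    return replicas_available >= min_size


if __name__ == "__main__":
    # (label, size, min_size, storage nodes currently up)
    scenarios = [
        ("default 3/2, 2 storage nodes, both up", 3, 2, 2),
        ("default 3/2, 2 storage nodes, one down", 3, 2, 1),
        ("size 2 / min 1, 2 storage nodes, one down", 2, 1, 1),
        ("default 3/2, 3 storage nodes, one down", 3, 2, 2),
    ]
    for label, size, min_size, up in scenarios:
        state = "active" if pool_is_active(size, min_size, up) else "inactive"
        print(f"{label}: {state}")
```

In this model, a default pool on 2 storage nodes stops being active as soon as one node goes down, while a size 3 pool on 3 storage nodes survives the loss of one node.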

Thanks Admin for the response.

I did remove the default pool and recreated it with size 2 and min size 1, so this should already have been OK.

We are working with some different hardware to create a 3 node cluster at present, but I wanted to test the failover on the services while I'm waiting for some new parts to be delivered.

 

How many OSDs do you have per storage host? What type of disks, and how much RAM?

What is the cluster status, and the error message, if any?