Failover Query
adrianm
5 Posts
February 4, 2021, 8:24 am
Hi,
Could someone advise on the usual failover time for a cluster?
We have 3 nodes at present (testing): 1 management-only node and 2 nodes with replicated OSDs. These 2 nodes run iSCSI, CIFS and NFS.
If I pull the power on one of the nodes, the following happens:
The dashboard shows a health warning about node 1 (management) having slow ops
IPs for CIFS fail over to node 3
IPs for NFS do not fail over
IPs for iSCSI do not fail over
Trying to get a status for iSCSI or NFS while one node is off leads to a gateway timeout.
As soon as power is restored to the node that was off, the dashboard picks up the dropped OSD, adjusts for a few seconds, and in the blink of an eye everything is back to normal.
Any advice would be appreciated.
Adrian
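When the dashboard times out during a failover test, the underlying Ceph state can still be queried from the command line. A minimal sketch using standard Ceph commands (assumes SSH access to a node that is still up):

# Overall cluster health and any warnings (slow ops, down OSDs, inactive PGs)
ceph -s
ceph health detail

# Which OSDs are up or down, grouped by host
ceph osd tree

# Replication settings (size / min_size) that decide whether pool I/O stays active
ceph osd pool ls detail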
admin
2,930 Posts
February 4, 2021, 10:13 am
The default pool replica size is 3 and the min size is 2; below the min size, your pool will not be active.
If you have only 2 storage nodes, you can set your pool size to 2 and min size to 1 so it stays active with only 1 node up, although this is not recommended at all, as it could lead to data integrity issues from split-brain, etc. If you value your data, use a size-3 pool (the default) and have 3 storage nodes.
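For reference, a minimal sketch of checking and changing these settings with standard Ceph commands; the pool name rbd below is only a placeholder, substitute your own pool:

# Show the current size / min_size of every pool
ceph osd pool ls detail

# Check a single pool (placeholder name: rbd)
ceph osd pool get rbd size
ceph osd pool get rbd min_size

# Not recommended: size 2 / min_size 1 keeps I/O active with a single storage node
ceph osd pool set rbd size 2
ceph osd pool set rbd min_size 1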
adrianm
5 Posts
February 4, 2021, 10:20 am
Thanks Admin for the response.
I did remove the default pool and recreated it with size 2 and min size 1 (2 nodes, active with 1 up), so this should already have been OK.
We are working with some different hardware to build a 3-node cluster at present, but I wanted to test failover of the services while I'm waiting for some new parts to be delivered.
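A quick sanity check that the recreated pool really carries size 2 / min_size 1, and that its placement groups stay active while one storage node is down, might look like this (pool name is a placeholder):

# Confirm the pool's replication settings (placeholder name: rbd)
ceph osd pool get rbd size
ceph osd pool get rbd min_size

# With one storage node off, PGs should still report active (likely undersized/degraded)
ceph pg stat
ceph health detail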
admin
2,930 Posts
February 4, 2021, 10:57 am
How many OSDs do you have per storage host? What type of disks, and how much RAM?
What is the cluster status, and is there any error message?
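For gathering the details asked for above, a rough sketch of commands that report OSD count per host, disk type and RAM (run on each storage node):

# OSD layout per host and per-OSD usage
ceph osd tree
ceph osd df

# Disk type (ROTA=1 rotational HDD, 0 SSD) and size
lsblk -d -o NAME,SIZE,ROTA,MODEL

# Installed RAM
free -h

# Cluster status and any error or warning messages
ceph -s
ceph health detail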