Replace management node when 2 of 3 nodes are down
phanvlong
9 Posts
September 22, 2021, 5:52 pm
Hi All,
I have a cluster with 3 management nodes, but 2 of them died after an accidental power loss. I have now installed a new node, but I can't Replace Management Node; after a while I get this error message:
Alert!
Cannot perform replace, ceph monitor status is not healthy.
How can I restore the cluster now?
Thanks for your help.
Best regards
Phan Long
Shiori
86 Posts
December 17, 2021, 1:27 pm
You must get 2 of the 3 management nodes up to replace a node. Sorry, but there is no other option.
atselitan
21 Posts
February 16, 2023, 4:53 am
Hello.
I have the same problem - I have lost 2 of 3 monitors.
When I try to replace one of them I receive the message: "Cannot replace management node, ceph monitors are not in quorum."
When I try to execute the "ceph -s" command, the console hangs. The cluster does not work.
What can I do in this situation? Is there really no way to fix it?
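As a side note: when the monitors have lost quorum, ceph -s hangs because it needs a quorum to answer. One way to still see what an individual monitor thinks, a sketch assuming the local ceph-mon daemon is at least running, is to query it over its admin socket:
# Ask the local monitor for its own view of the cluster; this works without
# quorum as long as the ceph-mon process itself is running. Replace
# sds-osd-141 with the name of the local monitor.
ceph daemon mon.sds-osd-141 mon_status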
Last edited on February 16, 2023, 5:06 am by atselitan · #3
atselitan
21 Posts
February 16, 2023, 8:01 am
Hello again!
I was able to start Ceph on one of the two failed management nodes, but the PetaSAN services do not work.
I have 3 management nodes:
sds-osd-101 - 10.1.19.181
sds-osd-121 - 10.1.19.186
sds-osd-141 - 10.1.19.191
I am trying to replace sds-osd-121.
When I enter Management node IP 10.1.19.181 (sds-osd-101) and the cluster password and click "Next", I receive the alert "Error connecting to management node."
When I enter Management node IP 10.1.19.191 (sds-osd-141), I can click Next and I see the network information. But after clicking Next again I receive the alert "Error deploying node. Could not connect to 10.1.19.181".
Is there a way to replace a management node with only one other management node accessible?
The history of the disaster:
1. A system hard drive failed in the server sds-osd-121.
2. I replaced the hard drive and tried to perform the Replace Node operation. To do this I had to go to http://10.1.19.186:5001/, but I made a mistake and went to http://10.1.19.181:5001/, tried to connect to Management node IP 10.1.19.181 (to itself), and received an error.
3. Then I corrected the address to http://10.1.19.186:5001/, but that did not help to finish the node replacement. See the alert text above.
4. After that I rebooted 10.1.19.181 (sds-osd-101), hoping it would help. But after the reboot the network and services did not start. It seems I corrupted the 10.1.19.181 system during step 2.
5. I was able to start the network and Ceph services on 10.1.19.181 (sds-osd-101) manually, but it seems the running PetaSAN services are needed to perform the node replacement, and I don't know how to start the PetaSAN services correctly by hand.
Please help me finish the replacement of node sds-osd-121 if it is possible.
The PetaSAN version is 3.0.1.
####################################
Update:
I tried to restart petasan-start-services.service on sds-osd-101 (10.1.19.181) with the result below:
root@sds-osd-101:~# systemctl restart petasan-start-services.service
root@sds-osd-101:~# systemctl status petasan-start-services.service
● petasan-start-services.service - PetaSAN Start Services
Loaded: loaded (/lib/systemd/system/petasan-start-services.service; enabled; vendor preset: enabled)
Active: active (exited) since Thu 2023-02-16 16:52:44 +07; 15s ago
Process: 12247 ExecStart=/opt/petasan/scripts/start_petasan_services.py (code=exited, status=0/SUCCESS)
Main PID: 12247 (code=exited, status=0/SUCCESS)
Feb 16 16:52:44 sds-osd-101 start_petasan_services.py[12250]: File "/usr/lib/python3.8/json/__init__.py", line 293, in load
Feb 16 16:52:44 sds-osd-101 start_petasan_services.py[12250]: return loads(fp.read(),
Feb 16 16:52:44 sds-osd-101 start_petasan_services.py[12250]: File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
Feb 16 16:52:44 sds-osd-101 start_petasan_services.py[12250]: return _default_decoder.decode(s)
Feb 16 16:52:44 sds-osd-101 start_petasan_services.py[12250]: File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
Feb 16 16:52:44 sds-osd-101 start_petasan_services.py[12250]: obj, end = self.raw_decode(s, idx=_w(s, 0).end())
Feb 16 16:52:44 sds-osd-101 start_petasan_services.py[12250]: File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
Feb 16 16:52:44 sds-osd-101 start_petasan_services.py[12250]: obj, end = self.scan_once(s, idx)
Feb 16 16:52:44 sds-osd-101 start_petasan_services.py[12250]: json.decoder.JSONDecodeError: Expecting ',' delimiter: line 72 column 1 (char 2018)
Feb 16 16:52:44 sds-osd-101 systemd[1]: Finished PetaSAN Start Services.
root@sds-osd-101:~#
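The traceback above comes from json.load(), so start_petasan_services.py is apparently reading a JSON configuration file that is truncated or corrupted (possibly damaged by the reboot after the failed replace attempt). As a rough way to locate the broken file, a sketch assuming the PetaSAN configs live under /opt/petasan/config, each .json file can be validated with Python's built-in json.tool:
# Check every JSON file under /opt/petasan/config (path is an assumption,
# adjust if your configs live elsewhere); any file that fails to parse is
# reported, which should point at the file the traceback is choking on.
for f in $(find /opt/petasan/config -name '*.json'); do
    python3 -m json.tool "$f" > /dev/null || echo "corrupt: $f"
done
A file that fails the check could then be compared with the same file on the healthy node sds-osd-141 and restored from there.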
Last edited on February 16, 2023, 9:57 am by atselitan · #4
admin
2,930 Posts
February 16, 2023, 10:59 am
The issue is that if you have only 1 node up out of 3, you do not have a cluster up: the underlying systems used by PetaSAN (mainly Ceph, Consul and Gluster) are distributed and need a quorum to be up.
It is possible to recreate the cluster from 1 node, but it is not straightforward and requires a lot of manual steps and configuration. When you say you were able to bring Ceph up on the second node with only one node up, do you mean the Ceph services like mons and OSDs are up, or do they actually see each other as one cluster and report a healthy cluster status?
To bring the system up yourself, I recommend you look at the docs for Ceph, Consul and Gluster and at how to restore an existing cluster starting from only 1 node up. I recommend you start with Ceph first, then Consul, then Gluster.
You can also consider getting support from us for things like this.
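For the Ceph part, the documented procedure for recovering monitor quorum when most monitors are lost is to extract the monmap on a surviving monitor, remove the dead monitors from it, and inject it back. A rough sketch is below; the hostnames are placeholders taken from this thread, so adjust them to whichever monitors are actually alive or dead, and double-check against the Ceph documentation before running anything:
# Restore mon quorum by removing dead monitors from the monmap, run on the
# surviving monitor host. Names below are placeholders for this cluster.
SURVIVOR=sds-osd-141                   # the monitor that is still healthy
DEAD_MONS="sds-osd-101 sds-osd-121"    # monitors that are gone

systemctl stop ceph-mon@$SURVIVOR
ceph-mon -i $SURVIVOR --extract-monmap /tmp/monmap   # dump the current monmap
for m in $DEAD_MONS; do
    monmaptool /tmp/monmap --rm $m                   # drop each dead monitor
done
ceph-mon -i $SURVIVOR --inject-monmap /tmp/monmap    # write the edited map back
systemctl start ceph-mon@$SURVIVOR
Once the surviving monitor forms a quorum on its own, replacement monitors can be added back later.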
atselitan
21 Posts
February 16, 2023, 11:15 am
Quote from admin on February 16, 2023, 10:59 am
When you say you were able to bring ceph up on second node with only one node up, do you mean the ceph services like mons and osds are up or do they actually see each other as 1 cluster and have a successful status as a cluster ?
I have a working ceph cluster with quorum of two mons:
root@sds-osd-101:~# ceph -s
cluster:
id: e9bd44bf-a956-4ae5-9b31-c8a48dad2f61
health: HEALTH_WARN
clock skew detected on mon.sds-osd-141
1/3 mons down, quorum sds-osd-101,sds-osd-141
services:
mon: 3 daemons, quorum sds-osd-101,sds-osd-141 (age 5h), out of quorum: sds-osd-121
mgr: sds-osd-101(active, since 4h), standbys: sds-osd-141
mds: 2 up:standby
osd: 504 osds: 467 up (since 5h), 467 in (since 8h); 853 remapped pgs
data:
pools: 3 pools, 18433 pgs
objects: 75.24M objects, 284 TiB
usage: 838 TiB used, 1.7 PiB / 2.5 PiB avail
pgs: 5688340/225725715 objects misplaced (2.520%)
17580 active+clean
612 active+remapped+backfill_wait
241 active+remapped+backfilling
io:
client: 2.9 MiB/s rd, 1.8 MiB/s wr, 15 op/s rd, 150 op/s wr
recovery: 4.8 GiB/s, 1.27k objects/s
Last edited on February 16, 2023, 11:30 am by atselitan · #6
admin
2,930 Posts
February 16, 2023, 2:37 pm
Very good. As per the previous post, the next steps are Consul (first), then Gluster.
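For the Consul step, the documented outage-recovery procedure for raft protocol 2 (the protocol used by the agent command quoted later in this thread) is to stop the agent on each surviving server, write a raft/peers.json listing the servers that still exist, and start the agents again. A sketch, with CONSUL_DATA_DIR as a placeholder for the data_dir set in the server config under /opt/petasan/config/etc/consul.d/server:
# Consul outage recovery via raft/peers.json (raft protocol 2 format).
# Stop any running consul agent first, then write the peer list; list only
# the backend IPs of Consul servers that actually exist (8300 is Consul's
# server RPC port). The IPs below are the ones mentioned in this thread.
CONSUL_DATA_DIR=/path/to/consul/data    # placeholder, check your server config

cat > "$CONSUL_DATA_DIR/raft/peers.json" <<'EOF'
["10.1.31.11:8300", "10.1.31.16:8300", "10.1.31.20:8300"]
EOF
# Start consul again the way it is normally started on these nodes, then verify:
consul members
consul operator raft list-peers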
atselitan
21 Posts
February 16, 2023, 4:07 pm
Thanks for your help!
Can you tell me please why this can happen?
I started Consul on the corrupted node sds-osd-101 with the command below:
consul agent -raft-protocol 2 -config-dir /opt/petasan/config/etc/consul.d/server -bind 10.1.31.11 -retry-join 10.1.31.20 -retry-join 10.1.31.16
Then my iSCSI nodes started powering themselves off until only one of them was left alive. If I turn them back on, they keep powering off again after the operating system loads, except for one of them, and it is not always the same node. The storage nodes all stay alive. When I stop Consul on sds-osd-101, the iSCSI nodes stop powering off.
I encounter a similar case during normal operation about once a year, but unlike this case, it is enough to turn the iSCSI nodes on once, after which they work for a long time.
Have you come across such cases?
Last edited on February 16, 2023, 4:35 pm by atselitan · #8
admin
2,930 Posts
February 16, 2023, 4:35 pm
I am not sure what state the Consul system is in. But if the entire cluster crashes, it is possible that it still holds old runtime configuration data such as iSCSI disk assignments; a normal cluster will clean these entries, but maybe a crashed cluster retains this old data. If so, it is possible that fencing is killing nodes that serve iSCSI disks but do not respond to heartbeats. If this is the case, you could try to search for the configuration key for fencing and turn it off; the key is under the PetaSAN root key, and you can use the consul kv get/set commands. Alternatively, it may be easier to just let the fencing occur several times; it should stop once the disks are transferred to running systems. Again, this is just a guess. Good luck.
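A quick way to hunt for that key, just a sketch since the exact key name is not given above, is to dump everything under the PetaSAN root with the standard consul kv commands and filter for fencing- or maintenance-related entries:
# Dump all keys and values under the PetaSAN root and filter for anything
# that looks fencing- or maintenance-related (the exact key name is unknown
# here, so this only narrows the search):
consul kv get -recurse PetaSAN | grep -iE 'fenc|mainten'

# Once the real key is identified, it can be read and changed with:
#   consul kv get <full/key/path>
#   consul kv put <full/key/path> <new-value>
If nothing matches, the setting may not be stored in Consul at all.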
atselitan
21 Posts
February 20, 2023, 2:31 pm
Quote from admin on February 16, 2023, 4:35 pm
If this is the case you could try to search for the configuration key for fencing and turn it off, the key is under PetaSAN root key, you can use consul kv get/set commands.
I cannot find a key named "fencing" in Consul:
root@sds-osd-141:~# consul kv export|grep fencing
root@sds-osd-141:~#
Is this the correct name of the key?