Adding more OSDs than configured
Ste
125 Posts
May 3, 2018, 3:18 pm
Hi all, I set up a 3-node PetaSAN 2.0.0 cluster with a total of 13 OSDs. At initial configuration time I chose "up to 15 OSDs". Now I'm trying to add a fourth host with 5 OSDs, but it doesn't work; I suppose it is because of the OSD limitation. Here is the status:
root@petatest01:~# ceph osd tree --cluster=petasan
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 4.74617 root default
-5 1.36345 host petatest01
3 hdd 0.27269 osd.3 up 1.00000 1.00000
4 hdd 0.27269 osd.4 up 1.00000 1.00000
5 hdd 0.27269 osd.5 up 1.00000 1.00000
6 hdd 0.27269 osd.6 up 1.00000 1.00000
7 hdd 0.27269 osd.7 up 1.00000 1.00000
-7 1.36345 host petatest02
8 hdd 0.27269 osd.8 up 1.00000 1.00000
9 hdd 0.27269 osd.9 up 1.00000 1.00000
10 hdd 0.27269 osd.10 up 1.00000 1.00000
11 hdd 0.27269 osd.11 up 1.00000 1.00000
12 hdd 0.27269 osd.12 up 1.00000 1.00000
-3 0.20009 host petatest03
0 hdd 0.06670 osd.0 up 1.00000 1.00000
1 hdd 0.06670 osd.1 up 1.00000 1.00000
2 hdd 0.06670 osd.2 up 1.00000 1.00000
-9 1.81918 host petatest04
13 hdd 0.90959 osd.13 down 0 1.00000
14 hdd 0.90959 osd.14 down 0 1.00000
Is there a way to manually fix this, or do I have to scratch everything and re-install the cluster?
Thanks and bye, S.
admin
2,930 Posts
May 3, 2018, 6:27 pm
The 15 OSDs is not a hard limit; it is used to tune the ideal PG count. Things should work fine if you add more OSDs/nodes.
I understand that when you added osds 13/14 they came up as down, is this correct? If you reboot the node, is there no fix?
If you try to start the OSDs manually via the command line / ssh, what error do you get?
/usr/lib/ceph/ceph-osd-prestart.sh --cluster petasan --id 13
/usr/bin/ceph-osd -f --cluster petasan --id 13 --setuser ceph --setgroup ceph
Can you see errors in
/opt/petasan/log/ceph-disk.log
/var/log/ceph/ceph-osd.13.log
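For reference, a minimal sketch of how the checks above could be run on the new node, assuming standard ceph-osd systemd units are in place (the OSD id 13 is only an example):
# is the OSD daemon running, and why did it stop?
systemctl status ceph-osd@13
journalctl -u ceph-osd@13 --no-pager | tail -n 50
# try to start it and follow its log
systemctl start ceph-osd@13
tail -f /var/log/ceph/ceph-osd.13.log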
Ste
125 Posts
May 4, 2018, 10:21 am
Yesterday I re-installed the PetaSAN software on host #4 (petatest04), so today I removed osd.13 and osd.14 and host petatest04 from the cluster and started the "join existing cluster" procedure again.
This time 3 OSDs (out of a total of 5 available on the node) were successfully added and came up:
root@petatest01:~# ceph osd tree --cluster=petasan
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 4.97336 root default
-5 1.36299 host petatest01
3 hdd 0.27299 osd.3 up 1.00000 1.00000
4 hdd 0.27299 osd.4 up 1.00000 1.00000
5 hdd 0.27299 osd.5 up 1.00000 1.00000
6 hdd 0.27299 osd.6 up 1.00000 1.00000
7 hdd 0.27299 osd.7 up 1.00000 1.00000
-7 1.36299 host petatest02
8 hdd 0.27299 osd.8 up 1.00000 1.00000
9 hdd 0.27299 osd.9 up 1.00000 1.00000
10 hdd 0.27299 osd.10 up 1.00000 1.00000
11 hdd 0.27299 osd.11 up 1.00000 1.00000
12 hdd 0.27299 osd.12 up 1.00000 1.00000
-3 0.20000 host petatest03
0 hdd 0.06699 osd.0 up 1.00000 1.00000
1 hdd 0.06699 osd.1 up 1.00000 1.00000
2 hdd 0.06699 osd.2 up 1.00000 1.00000
-9 2.04738 host petatest04
13 hdd 0.90959 osd.13 up 1.00000 1.00000
14 hdd 0.90959 osd.14 up 1.00000 1.00000
15 hdd 0.22820 osd.15 up 1.00000 1.00000
but after the 3rd OSD there was a problem: the 3 new OSDs went down and a recovery started:
13 hdd 0.90959 osd.13 down 0 1.00000
14 hdd 0.90959 osd.14 down 0 1.00000
15 hdd 0.22820 osd.15 down 0 1.00000
(pg_status screenshot)
After this, the new host was no longer reachable on either the management or backend IPs; after one hour the "Final deployment stage" was still running but had obviously hung. Then I forced a power cycle, and now the 4th node is online and the 3 newly added OSDs are up, but there are still 2 more OSDs to add. If I browse to http://<IP>:5001 the wizard appears; is it safe to run it again to add the other 2 OSDs?
Thanks, S.
Last edited on May 4, 2018, 10:33 am by Ste · #3
admin
2,930 Posts
May 4, 2018, 11:08 am
Can you check that you can ping on all subnets from node 4 to the cluster and vice versa?
Do you have enough RAM ?
Do you see any errors in /opt/petasan/log/PetaSAN.log ?
If you try to start a down OSD manually as per my previous post, what errors do you get on the console?
You can re-run the wizard; if you also run the atop command while it is running, do you see any resource issues on RAM/CPU/disks?
Last edited on May 4, 2018, 11:09 am by admin · #4
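A minimal sketch of the connectivity and resource checks suggested above, with placeholder addresses that would need to be replaced by the real management and backend IPs of the other nodes:
# from node 4, ping every other node on each subnet (addresses are placeholders)
for ip in 10.0.1.11 10.0.1.12 10.0.1.13 10.0.2.11 10.0.2.12 10.0.2.13; do
    ping -c 3 "$ip"
done
# watch memory, CPU and disk pressure while the wizard is running
free -m
atop 2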
Ste
125 Posts
May 4, 2018, 12:58 pm
Quote from admin on May 4, 2018, 11:08 am
Can you check you can ping from all subnets from node 4 to the cluster and vice versa.
After the power cycle, all network connections are ok. All 4 nodes and 16 OSDs are up, recovery ended and all PGs are active+clean.
Do you have enough RAM ?
Nodes #1, #2 and #4 have 4 GB, node #3 has 16 GB of RAM. Is this enough? The management nodes are #1, #2 and #3.
Do you see any errors in /opt/petasan/log/PetaSAN.log ?
Only node #1 now shows this recurring error (there are no errors on the other nodes):
04/05/2018 14:47:03 ERROR Error during process.
04/05/2018 14:47:03 ERROR [Errno 12] Cannot allocate memory
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 98, in start self.__process()
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 132, in __process while self.__do_process() != True:
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 193, in __do_process self.__clean_unused_rbd_images()
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 374, in __clean_unused_rbd_images rbd_images = ceph_api.get_mapped_images(pool)
File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/api.py", line 481, in get_mapped_images out ,err = cmd.exec_command("rbd --cluster {} showmapped".format(cluster_name))
File "/usr/lib/python2.7/dist-packages/PetaSAN/core/common/cmd.py", line 39, in exec_command p = subprocess.Popen(cmd,shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
File "/usr/lib/python2.7/subprocess.py", line 711, in __init__ errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
If you try to start a down OSD manually as per my prev post, what errors do you get on console ?
They started automatically at node reboot.
You can re run the wizard, if while this is running you also run the atop command do you see any resource issues on ram/cpu/disks ?
As soon as I fill in the "Management node IP to join" field and click "next" I get an error: "Error joining node to cluster." This seems quite reasonable, as the host is already in the cluster, even if not all of its OSDs are configured.
root@petatest02:~# ceph osd tree --cluster=petasan
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 4.97336 root default
-5 1.36299 host petatest01
3 hdd 0.27299 osd.3 up 1.00000 1.00000
4 hdd 0.27299 osd.4 up 1.00000 1.00000
5 hdd 0.27299 osd.5 up 1.00000 1.00000
6 hdd 0.27299 osd.6 up 1.00000 1.00000
7 hdd 0.27299 osd.7 up 1.00000 1.00000
-7 1.36299 host petatest02
8 hdd 0.27299 osd.8 up 1.00000 1.00000
9 hdd 0.27299 osd.9 up 1.00000 1.00000
10 hdd 0.27299 osd.10 up 1.00000 1.00000
11 hdd 0.27299 osd.11 up 1.00000 1.00000
12 hdd 0.27299 osd.12 up 1.00000 1.00000
-3 0.20000 host petatest03
0 hdd 0.06699 osd.0 up 1.00000 1.00000
1 hdd 0.06699 osd.1 up 1.00000 1.00000
2 hdd 0.06699 osd.2 up 1.00000 1.00000
-9 2.04738 host petatest04
13 hdd 0.90959 osd.13 up 1.00000 1.00000
14 hdd 0.90959 osd.14 up 1.00000 1.00000
15 hdd 0.22820 osd.15 up 1.00000 1.00000
I think I'll try the manual procedure to add the remaining 2 disks.
Bye, S.
Last edited on May 4, 2018, 1:04 pm by Ste · #5
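For reference, a hedged sketch of what such a manual OSD addition could look like on a Luminous-era node with ceph-disk, assuming the spare disk is /dev/sde and the cluster is named petasan (the device name is a placeholder, and the PetaSAN UI's own add-OSD action remains the supported path):
# destroy any stale partition table on the spare disk (destructive!)
ceph-disk zap /dev/sde
# prepare and activate a BlueStore OSD for the 'petasan' cluster
ceph-disk prepare --cluster petasan --bluestore /dev/sde
ceph-disk activate /dev/sde1
# confirm the new OSD appears in the tree and comes up
ceph osd tree --cluster=petasan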
admin
2,930 Posts
May 4, 2018, 1:09 pm
4 GB is not enough for 5 OSDs, please see the hardware recommendation guide.
It probably worked initially when you had no data, but then you tried to add node 4 with 5 OSDs; this triggered a re-balance and the node could not handle it. Even if the cluster is working now, it may be stressed again if a node dies and recovery needs to happen.
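If more RAM cannot be added right away, one possible stop-gap (a sketch only, not something recommended in this thread) is to lower the per-OSD BlueStore cache in the cluster config; the value below is illustrative:
# /etc/ceph/petasan.conf -- reduce the Luminous HDD default of ~1 GB per OSD
[osd]
bluestore_cache_size_hdd = 536870912    ; 512 MB
# then restart the OSD daemons on the affected node
systemctl restart ceph-osd.target    # restarts all OSDs on this node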
Ste
125 Posts
May 4, 2018, 2:13 pm
OK, thank you for the hint, I'll definitely take it into account when planning the production cluster.
Anyway, I think I now have another issue: when the fourth OSD is added, networking stops. I suspect an IRQ sharing problem between the SATA controller and the PCI slot where the 2nd network card is installed.
admin
2,930 Posts
May 4, 2018, 3:02 pm
For your second issue, "networking is stopped": I am not sure what this means in detail, but as long as your IPs and network are set up correctly and you do not have severe resource issues (which you seem to have), things should be OK.
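A minimal sketch of checks that could help narrow down a "networking is stopped" symptom, assuming the backend NIC is eth1 (the interface name is a placeholder):
# link state and error counters on the backend interface
ip -br link
ip -s link show eth1
# driver/link details and recent kernel messages around the hang
ethtool eth1
dmesg | tail -n 50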
Ste
125 Posts
May 7, 2018, 2:59 pm
Quote from admin on May 4, 2018, 3:02 pm
For your second issue: "networking is stopped", not sure what this means in detail ...
I figured out where the issue was. The node #4 motherboard has 6 SATA ports, but only 4 of them are independent; ports #5 and #6 share their IRQ with other devices, such as the network card. This caused networking to stop when accessing the disk on port #5. So I reduced the number of disks from 6 to 4 (1 for the OS + 3 OSDs) and now node #4 has been successfully added to the cluster. The join procedure kept failing during disk preparation though, so I manually deleted and re-added the 3 OSDs and the status is finally healthy.
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 5.65475 root default
-5 1.36299 host petatest01
3 hdd 0.27299 osd.3 up 1.00000 1.00000
4 hdd 0.27299 osd.4 up 1.00000 1.00000
5 hdd 0.27299 osd.5 up 1.00000 1.00000
6 hdd 0.27299 osd.6 up 1.00000 1.00000
7 hdd 0.27299 osd.7 up 1.00000 1.00000
-7 1.36299 host petatest02
8 hdd 0.27299 osd.8 up 1.00000 1.00000
9 hdd 0.27299 osd.9 up 1.00000 1.00000
10 hdd 0.27299 osd.10 up 1.00000 1.00000
11 hdd 0.27299 osd.11 up 1.00000 1.00000
12 hdd 0.27299 osd.12 up 1.00000 1.00000
-3 0.20000 host petatest03
0 hdd 0.06699 osd.0 up 1.00000 1.00000
1 hdd 0.06699 osd.1 up 1.00000 1.00000
2 hdd 0.06699 osd.2 up 1.00000 1.00000
-9 2.72878 host petatest04
13 hdd 0.90959 osd.13 up 1.00000 1.00000
14 hdd 0.90959 osd.14 up 1.00000 1.00000
15 hdd 0.90959 osd.15 up 1.00000 1.00000
Thanks, Ste.
Last edited on May 7, 2018, 3:05 pm by Ste · #9
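For anyone hitting something similar, a short sketch of how shared IRQs between a SATA controller and a NIC can be spotted (the grep patterns are illustrative and depend on the driver names):
# an AHCI/SATA controller and a NIC sharing an IRQ show up on the same line
grep -E 'ahci|eth' /proc/interrupts
# PCI view of which device is routed to which IRQ
lspci -v | grep -E 'SATA|Ethernet|IRQ'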
Ste
125 Posts
May 7, 2018, 3:00 pm
( ... to be deleted ...)
Last edited on May 7, 2018, 3:04 pm by Ste · #10