could not add all osd
BonsaiJoe
53 Posts
February 1, 2018, 8:12 am
Hi,
after building our first cluster of 4 nodes, each with 20x 1.8 TB SAS disks and 4x 400 GB SSDs, we can only see half of the OSDs being used by Ceph.
On 2 of the nodes only 1 OSD each is in use. If we try to add an OSD manually, the system shows "adding", and after a couple of seconds the "adding" state disappears but the OSD is not added to the cluster.
We also deleted the partition tables of the disks we are trying to add, because they were part of a ZFS cluster before, but it is still not possible to add them.
In petasan.log we did not find any error:
01/02/2018 06:55:41 INFO Start add osd job for disk sdb.
01/02/2018 06:55:42 INFO Start cleaning disks
01/02/2018 06:55:43 INFO Starting ceph-disk zap /dev/sdb
01/02/2018 06:55:45 INFO Auto select journal for disk sdb.
01/02/2018 06:55:48 INFO User selected auto journal and the selected journal is /dev/sdu disk for disk sdb.
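For reference, a non-destructive way to check whether old filesystem or ZFS signatures are still present on a disk is wipefs without options, which only lists what it finds and changes nothing, e.g. for the disk from the log above:
wipefs /dev/sdb          # lists any remaining signatures (e.g. zfs_member) without modifying the disk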
The OSD tree shows only 1 OSD each on node 1 and node 2:
root@ps-cl01-node01:~# ceph osd tree --cluster ps-cl01
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 68.73706 root default
-2 32.73193 host ps-cl01-node03
0 1.63660 osd.0 up 1.00000 1.00000
1 1.63660 osd.1 up 1.00000 1.00000
2 1.63660 osd.2 up 1.00000 1.00000
3 1.63660 osd.3 up 1.00000 1.00000
4 1.63660 osd.4 up 1.00000 1.00000
5 1.63660 osd.5 up 1.00000 1.00000
6 1.63660 osd.6 up 1.00000 1.00000
7 1.63660 osd.7 up 1.00000 1.00000
8 1.63660 osd.8 up 1.00000 1.00000
9 1.63660 osd.9 up 1.00000 1.00000
10 1.63660 osd.10 up 1.00000 1.00000
11 1.63660 osd.11 up 1.00000 1.00000
12 1.63660 osd.12 up 1.00000 1.00000
13 1.63660 osd.13 up 1.00000 1.00000
14 1.63660 osd.14 up 1.00000 1.00000
15 1.63660 osd.15 up 1.00000 1.00000
16 1.63660 osd.16 up 1.00000 1.00000
17 1.63660 osd.17 up 1.00000 1.00000
18 1.63660 osd.18 up 1.00000 1.00000
19 1.63660 osd.19 up 1.00000 1.00000
-3 1.63660 host ps-cl01-node01
20 1.63660 osd.20 up 1.00000 1.00000
-4 1.63660 host ps-cl01-node02
21 1.63660 osd.21 up 1.00000 1.00000
-5 32.73193 host ps-cl01-node04
22 1.63660 osd.22 up 1.00000 1.00000
23 1.63660 osd.23 up 1.00000 1.00000
24 1.63660 osd.24 up 1.00000 1.00000
25 1.63660 osd.25 up 1.00000 1.00000
26 1.63660 osd.26 up 1.00000 1.00000
27 1.63660 osd.27 up 1.00000 1.00000
28 1.63660 osd.28 up 1.00000 1.00000
29 1.63660 osd.29 up 1.00000 1.00000
30 1.63660 osd.30 up 1.00000 1.00000
31 1.63660 osd.31 up 1.00000 1.00000
32 1.63660 osd.32 up 1.00000 1.00000
33 1.63660 osd.33 up 1.00000 1.00000
34 1.63660 osd.34 up 1.00000 1.00000
35 1.63660 osd.35 up 1.00000 1.00000
36 1.63660 osd.36 up 1.00000 1.00000
37 1.63660 osd.37 up 1.00000 1.00000
38 1.63660 osd.38 up 1.00000 1.00000
39 1.63660 osd.39 up 1.00000 1.00000
40 1.63660 osd.40 up 1.00000 1.00000
41 1.63660 osd.41 up 1.00000 1.00000
detect-disks.sh on node 1:
root@ps-cl01-node01:~# /opt/petasan/scripts/detect-disks.sh
device=sda,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bb69c
device=sdb,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bad70
device=sdc,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5b9e94
device=sdd,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5b9d58
device=sde,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bb158
device=sdf,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bb7b0
device=sdg,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bb078
device=sdh,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bae8c
device=sdi,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bacf4
device=sdj,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5ba140
device=sdk,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bb284
device=sdl,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5baafc
device=sdm,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bb034
device=sdn,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5b9d38
device=sdo,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bb190
device=sdp,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5b88ac
device=sdq,size=293046768,bus=SATA,fixed=Yes,ssd=Yes,vendor=,model=INTEL_SSDSC2BB150G7,serial=BTDV73260A58150MGN
device=sdr,size=293046768,bus=SATA,fixed=Yes,ssd=Yes,vendor=,model=INTEL_SSDSC2BB150G7,serial=BTDV73350A70150MGN
device=sds,size=781422768,bus=SATA,fixed=Yes,ssd=Yes,vendor=,model=INTEL_SSDSC2BA400G4,serial=BTHV73340A7N400NGN
device=sdt,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bb290
device=sdu,size=781422768,bus=SATA,fixed=Yes,ssd=Yes,vendor=,model=INTEL_SSDSC2BA400G4,serial=BTHV7334077H400NGN
device=sdv,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5b9cbc
device=sdw,size=781422768,bus=SATA,fixed=Yes,ssd=Yes,vendor=,model=INTEL_SSDSC2BA400G4,serial=BTHV73340GJM400NGN
device=sdx,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bb684
device=sdy,size=781422768,bus=SATA,fixed=Yes,ssd=Yes,vendor=,model=INTEL_SSDSC2BA400G4,serial=BTHV73340GHZ400NGN
device=sdz,size=3516328368,bus=SCSI,fixed=Yes,ssd=No,vendor=HGST,model=HUC101818CS4200,serial=5000cca02c5bb15c
root@ps-cl01-node01:~# ceph-disk list
/dev/sda :
/dev/sda1 ceph data, active, cluster ps-cl01, osd.20, journal /dev/sds1
/dev/sdb :
/dev/sdb1 other
/dev/sdc other, unknown
/dev/sdd :
/dev/sdd1 other
/dev/sde other, unknown
/dev/sdf other, unknown
/dev/sdg other, unknown
/dev/sdh other, unknown
/dev/sdi other, unknown
/dev/sdj other, unknown
/dev/sdk other, unknown
/dev/sdl other, unknown
/dev/sdm other, unknown
/dev/sdn other, unknown
/dev/sdo other, unknown
/dev/sdp other, unknown
/dev/sdq :
/dev/sdq2 other, ext4, mounted on /
/dev/sdq1 other, ext4, mounted on /boot
/dev/sdq4 other, ext4, mounted on /opt/petasan/config
/dev/sdq3 other, ext4, mounted on /var/lib/ceph
/dev/sdr other, unknown
/dev/sds :
/dev/sds1 ceph journal, for /dev/sda1
/dev/sdt other, unknown
/dev/sdu :
/dev/sdu1 ceph journal
/dev/sdv other, unknown
/dev/sdw :
/dev/sdw1 ceph journal
/dev/sdx other, unknown
/dev/sdy :
/dev/sdy1 ceph journal
/dev/sdz other, unknown
Last edited on February 1, 2018, 9:09 am by BonsaiJoe · #1
admin
2,930 Posts
February 1, 2018, 9:21 am
It is most likely related to old ZFS metadata that is not cleaned up by deleting the partition table, as per
http://tracker.ceph.com/issues/19248
This may be the fix:
http://www.petasan.org/forums/?view=thread&id=152&part=2#postid-805
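For reference, a common workaround for leftover ZFS labels (not necessarily the exact fix linked above) is to wipe the labels explicitly; ZFS keeps copies of its label at both the start and the end of the device, so deleting the partition table alone does not remove them. Roughly, on the affected disk (shown here for a placeholder /dev/sdX, and this destroys any data on it):
wipefs -a /dev/sdX
# or, more brute force, zero the first and last ~100 MB of the disk:
dd if=/dev/zero of=/dev/sdX bs=1M count=100
dd if=/dev/zero of=/dev/sdX bs=1M count=100 seek=$(( $(blockdev --getsz /dev/sdX) / 2048 - 100 ))
After that, re-adding the OSD from the UI should be able to prepare the disk cleanly.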
BonsaiJoe
53 Posts
February 1, 2018, 10:08 am
Perfect, thanks, now it looks good.
But unfortunately I added one wrong OSD. How can I remove this OSD from the cluster? (It is a small SSD and should be used in the future for boot system redundancy.)
admin
2,930 Posts
February 1, 2018, 10:50 am
Happy it looks good. It is one of those tough issues to troubleshoot, since ceph-disk does not return any errors.
For the other issue, if I understand correctly you added an extra OSD that you want removed; the OSD was added successfully and is up. If so, in PetaSAN we allow deletion of an OSD from the UI only if it is down, but we do not allow stopping it from the UI, so you need to stop it manually:
systemctl stop ceph-osd@X
where X is the OSD number.
Soon after, the UI will show it as down and allow you to delete it.
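For example, assuming purely for illustration that the unwanted OSD got the id 42 (take the real id from ceph osd tree), the sequence on the node hosting it would look roughly like:
systemctl stop ceph-osd@42
ceph osd tree --cluster ps-cl01 | grep osd.42    # wait until it is reported as down
Then delete the OSD from the PetaSAN UI once it shows as down there.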
BonsaiJoe
53 Posts
February 1, 2018, 2:25 pm
Thanks for your fast response ... yes, it was strange because there was no error message.
Now everything looks good. Thanks again for your help!