Lingering OSD
garfield659
10 Posts
August 2, 2024, 1:04 am
I was setting up a new node, added some new OSDs, and was testing disk throughput via the console (I thought I was on a different machine and didn't mean to do both on the same server). When I realized I was on the same server, I deleted the OSDs via the command line and am now stuck with one OSD that is lingering.
In the web interface it lists:
Name: <Blank> Size: <Blank> SSD: No Usage: OSD5 Class: auto Serial: <blank> Status: 0 Linked Devices: <Blank> OSD Usage: <Blank>
root@gs-ss1-ps5:~# ceph pg dump
version 8625
stamp 2024-08-01T20:57:24.666073-0400
last_osdmap_epoch 0
last_pg_scan 0
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG LOG_DUPS DISK_LOG STATE STATE_STAMP VERSION REPORTED UP UP_PRIMARY ACTING ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP LAST_DEEP_SCRUB DEEP_SCRUB_STAMP SNAPTRIMQ_LEN LAST_SCRUB_DURATION SCRUB_SCHEDULING OBJECTS_SCRUBBED OBJECTS_TRIMMED
1.0 2 0 0 0 0 590368 0 0 289 0 289 remapped+peering 2024-08-01T20:57:20.627869-0400 102'289 107:452 [0] 0 [1,0] 1 0'0 2024-08-01T17:41:59.396136-0400 0'0 2024-08-01T17:41:59.396136-0400 0 0 periodic scrub scheduled @ 2024-08-02T23:18:45.035057+0000 0 0
1 2 0 0 0 0 590368 0 0 289 289
sum 2 0 0 0 0 590368 0 0 289 289
OSD_STAT USED AVAIL USED_RAW TOTAL HB_PEERS PG_SUM PRIMARY_PG_SUM
4 291 MiB 5.8 TiB 291 MiB 5.8 TiB [] 0 0
5 0 B 0 B 0 B 0 B [] 0 0
3 292 MiB 5.8 TiB 292 MiB 5.8 TiB [0,1,2] 0 0
2 292 MiB 5.8 TiB 292 MiB 5.8 TiB [0,1,3] 0 0
1 292 MiB 5.8 TiB 292 MiB 5.8 TiB [0,2,3] 1 0
0 292 MiB 5.8 TiB 292 MiB 5.8 TiB [1,2,3] 1 1
sum 1.4 GiB 29 TiB 1.4 GiB 29 TiB
* NOTE: Omap statistics are gathered during deep scrub and may be inaccurate soon afterwards depending on utilization. See http://docs.ceph.com/en/latest/dev/placement-group/#omap-statistics for further details.
dumped all
root@gs-ss1-ps5:~#
Any idea how to remove it? I tried "ceph osd destroy 5 --yes-i-really-mean-it", which reported "destroyed osd.5", however it still remains. This is PetaSAN version 3.3.0.
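Note: as far as I understand it, "ceph osd destroy" only marks the OSD as destroyed in the OSD map so its ID can be reused later; it does not remove the CRUSH map entry or the OSD record itself, which would explain why it still shows up. Something like the following standard commands should show what is left behind (exact output will vary by cluster):
ceph osd tree
ceph osd dump | grep osd.5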
admin
2,930 Posts
August 2, 2024, 4:04 am
On the node with the issue, what is the output of:
ceph-volume lvm list
ceph osd tree
garfield659
10 Posts
August 2, 2024, 10:38 pm
OSD 5 is the one that seems stuck and cannot be removed. I have also tried rebooting the entire system (all nodes), but still have the stuck OSD 5.
root@gs-ss1-ps5:~# ceph-volume lvm list
====== osd.4 =======
[block] /dev/ceph-c047897b-a3c7-45b6-932e-e35ce45f91ca/osd-block-5e4f6434-e4ed-43b3-9469-45acdf049533
block device /dev/ceph-c047897b-a3c7-45b6-932e-e35ce45f91ca/osd-block-5e4f6434-e4ed-43b3-9469-45acdf049533
block uuid CNtWcb-I0vt-7uSk-b22C-u0mh-yhdH-nHqY85
cephx lockbox secret
cluster fsid b1042b3d-e000-4a7c-9b6e-c03a8a8ea373
cluster name ceph
crush device class
encrypted 0
osd fsid 5e4f6434-e4ed-43b3-9469-45acdf049533
osd id 4
osdspec affinity
type block
vdo 0
devices /dev/nvme0n1p1
root@gs-ss1-ps5:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 34.93140 root default
-3 23.28760 host gs-ss1-ps4
0 ssd 5.82190 osd.0 up 1.00000 1.00000
1 ssd 5.82190 osd.1 up 1.00000 1.00000
2 ssd 5.82190 osd.2 up 1.00000 1.00000
3 ssd 5.82190 osd.3 up 1.00000 1.00000
-5 11.64380 host gs-ss1-ps5
4 ssd 5.82190 osd.4 up 1.00000 1.00000
5 ssd 5.82190 osd.5 destroyed 0 1.00000
root@gs-ss1-ps5:~#
admin
2,930 Posts
August 3, 2024, 2:54 pm
ceph osd rm osd.5
ceph osd crush remove osd.5
ceph auth del osd.5
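For reference, each of those commands cleans up a different place where a removed OSD can linger: "ceph osd rm" deletes the entry from the OSD map, "ceph osd crush remove" drops it from the CRUSH map (so it no longer appears in "ceph osd tree"), and "ceph auth del" removes its cephx key. On recent Ceph releases the single command below should perform the same three steps in one go, though I have not verified it on PetaSAN 3.3.0:
ceph osd purge osd.5 --yes-i-really-mean-it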
garfield659
10 Posts
August 3, 2024, 3:34 pm
That did it. Thank you very much.