
Multiple HDD OSD down after node/service restart


Parted output looks fine. It is worth noting that the HDDs in this setup are in an attached JBOD, while the SSDs are in the server chassis. We noticed that after rebooting, the volume groups no longer map to the correct physical volumes. Re-scanning for physical volumes doesn't seem to help. See output below.

root@vlab-ext-jfesx77-pvsa:~# parted /dev/sdb print
Model: ATA INTEL SSDSC2KG96 (scsi)
Disk /dev/sdb: 960GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name             Flags
1      1049kB  64.4GB  64.4GB               ceph-journal-db
2      64.4GB  129GB   64.4GB               ceph-journal-db
3      129GB   193GB   64.4GB               ceph-journal-db
4      193GB   258GB   64.4GB               ceph-journal-db
5      258GB   322GB   64.4GB               ceph-journal-db
6      322GB   387GB   64.4GB               ceph-journal-db
7      387GB   451GB   64.4GB               ceph-journal-db
8      451GB   515GB   64.4GB               ceph-journal-db
9      515GB   580GB   64.4GB               ceph-journal-db
10      580GB   644GB   64.4GB               ceph-journal-db
11      644GB   709GB   64.4GB               ceph-journal-db
12      709GB   773GB   64.4GB               ceph-journal-db


Before Reboot (everything looks good and maps):

root@vlab-ext-jfesx77-pvsa:/var/log/ceph# for vg in $(vgs | grep ceph | awk '{print $1}'); do ls -l /dev/$vg; done
total 0
lrwxrwxrwx 1 root root 7 Apr 27 15:51 osd-block-a954c4ec-f6d0-40f7-86d2-da9e37bb993e -> ../dm-4
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:29 osd-block-7e14dd09-f2a3-49a4-85a2-43e6823e108b -> ../dm-17
total 0
lrwxrwxrwx 1 root root 7 Apr 27 15:51 osd-block-cba6cef7-2b3c-42a0-b84e-df3c4b2a0751 -> ../dm-3
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:28 osd-block-62ce1570-e9b7-4f23-9700-d9bebf3dd9d8 -> ../dm-16
total 0
lrwxrwxrwx 1 root root 7 Apr 27 15:51 osd-block-8b744ab7-52f4-46ab-8e47-f1a3c6976114 -> ../dm-6
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:23 osd-block-b7824e0e-1e0c-4489-9fdb-d3ea1eacf9ab -> ../dm-13
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:34 osd-block-f00afa5e-385e-41f7-8a02-bcdf801be192 -> ../dm-21
total 0
lrwxrwxrwx 1 root root 7 Apr 27 15:51 osd-block-7a75dcdc-0faa-460b-b467-595a025f579e -> ../dm-5
total 0
lrwxrwxrwx 1 root root 7 Apr 27 15:51 osd-block-978909b4-c326-4cc8-8d6f-be481c29df9a -> ../dm-7
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:32 osd-block-2de65cca-a531-4e9f-99ce-7985e748b06f -> ../dm-19
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:33 osd-block-26dc2a8d-c976-479f-a636-c3840248e22f -> ../dm-20
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:30 osd-block-f2831751-f5e9-4d98-b00a-7f2e3f868f82 -> ../dm-18
total 0
lrwxrwxrwx 1 root root 7 Apr 27 15:51 osd-block-45c15cdb-d0bc-4c6d-8237-ffa36902be76 -> ../dm-0
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:26 osd-block-7495380b-e82f-4543-9ac7-5e76cda83970 -> ../dm-14
total 0
lrwxrwxrwx 1 root root 7 Apr 27 15:51 osd-block-a3af27a7-8b48-45cf-ab00-cd2e96bd2c33 -> ../dm-2
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:22 osd-block-6b3ba0ea-5851-4f46-9e54-0549c9951dbe -> ../dm-12
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:21 osd-block-4a4acd02-30e1-4c18-b96c-d442559f7418 -> ../dm-10
total 0
lrwxrwxrwx 1 root root 7 Apr 27 15:51 osd-block-6931d56e-ee06-492c-aab4-ac60e240cc4d -> ../dm-8
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:27 osd-block-0f245ad5-2e22-4daf-b44c-e472ae7ec1c5 -> ../dm-15
total 0
lrwxrwxrwx 1 root root 7 Apr 27 15:51 osd-block-ee3634fd-ad88-4fa1-9cc5-66d3fa89d029 -> ../dm-9
total 0
lrwxrwxrwx 1 ceph ceph 8 Apr 27 16:24 osd-block-b465a46c-619e-471d-9f9b-fc7e69af3d06 -> ../dm-11
total 0
lrwxrwxrwx 1 root root 7 Apr 27 15:51 osd-block-27a4410b-46fb-4253-9e3d-a65fb3f66169 -> ../dm-1


After Reboot (most no longer map):

root@vlab-ext-jfesx77-pvsa:~# for vg in $(vgs | grep ceph | awk '{print $1}'); do ls -l /dev/$vg; done
total 0
lrwxrwxrwx 1 root root 7 Apr 27 17:08 osd-block-a954c4ec-f6d0-40f7-86d2-da9e37bb993e -> ../dm-5
ls: cannot access '/dev/ceph-09fc42b6-17bc-4363-84e5-c58bff7cd078': No such file or directory
total 0
lrwxrwxrwx 1 root root 7 Apr 27 17:08 osd-block-cba6cef7-2b3c-42a0-b84e-df3c4b2a0751 -> ../dm-0
ls: cannot access '/dev/ceph-3ab2744c-bdbc-473f-ae62-9c7843a246e9': No such file or directory
total 0
lrwxrwxrwx 1 root root 7 Apr 27 17:08 osd-block-8b744ab7-52f4-46ab-8e47-f1a3c6976114 -> ../dm-6
ls: cannot access '/dev/ceph-59a8f04f-846f-44d3-b37e-93b041c118ac': No such file or directory
ls: cannot access '/dev/ceph-66878f06-fc95-4c84-b42d-8625725e5a9d': No such file or directory
total 0
lrwxrwxrwx 1 root root 8 Apr 27 17:08 osd-block-7a75dcdc-0faa-460b-b467-595a025f579e -> ../dm-10
total 0
lrwxrwxrwx 1 root root 7 Apr 27 17:08 osd-block-978909b4-c326-4cc8-8d6f-be481c29df9a -> ../dm-8
ls: cannot access '/dev/ceph-882556dc-baf5-4fbc-ac4d-213adda1bfa5': No such file or directory
ls: cannot access '/dev/ceph-887eaeb1-2b1e-4364-8282-ed188386ba6c': No such file or directory
ls: cannot access '/dev/ceph-8be328ec-e8fe-452d-9cb9-08774a9634cc': No such file or directory
total 0
lrwxrwxrwx 1 root root 7 Apr 27 17:08 osd-block-45c15cdb-d0bc-4c6d-8237-ffa36902be76 -> ../dm-2
ls: cannot access '/dev/ceph-b86dc808-b966-4ea3-805f-591838924cd1': No such file or directory
total 0
lrwxrwxrwx 1 root root 7 Apr 27 17:08 osd-block-a3af27a7-8b48-45cf-ab00-cd2e96bd2c33 -> ../dm-3
ls: cannot access '/dev/ceph-d71dcad6-eec0-4f52-a88c-ea8a623d38d6': No such file or directory
total 0
lrwxrwxrwx 1 root root 7 Apr 27 17:12 osd-block-4a4acd02-30e1-4c18-b96c-d442559f7418 -> ../dm-9
total 0
lrwxrwxrwx 1 root root 7 Apr 27 17:08 osd-block-6931d56e-ee06-492c-aab4-ac60e240cc4d -> ../dm-1
ls: cannot access '/dev/ceph-e6b46b10-4544-4864-a072-5f327a681402': No such file or directory
total 0
lrwxrwxrwx 1 root root 7 Apr 27 17:08 osd-block-ee3634fd-ad88-4fa1-9cc5-66d3fa89d029 -> ../dm-7
total 0
lrwxrwxrwx 1 root root 8 Apr 27 17:08 osd-block-b465a46c-619e-471d-9f9b-fc7e69af3d06 -> ../dm-11
total 0
lrwxrwxrwx 1 root root 7 Apr 27 17:08 osd-block-27a4410b-46fb-4253-9e3d-a65fb3f66169 -> ../dm-4

After the reboot, are the VGs active? If not, activate them:

lvm vgs -o name | grep ceph | xargs vgchange -ay

Then see if you can start the OSDs via:

systemctl start ceph-osd@XX

This is not a fix, just a way to tell whether the volumes come up active or not.

If they do come up OK, can you run:

pvs -o pv_name,vg_name,lv_name
ceph-volume lvm list


No, they do not come back up.

root@vlab-ext-jfesx77-pvsa:~# pvs -o pv_name,vg_name,lv_name
PV         VG                                        LV
/dev/sdd1  ceph-001d4412-6b8d-4d66-9726-2a813dfcbbe1 osd-block-e6283510-3dfd-451e-b507-0576923f8797
/dev/sde1  ceph-a4f31efe-71d8-4add-962b-dd7325f4fdfe osd-block-752fbcac-f638-4560-a1af-9ccb2c19c73c
/dev/sdf1  ceph-b91653d0-c161-448b-a6e2-962316650c54 osd-block-aa5bdf71-e94d-49f2-9e1e-33b9f95466e8
/dev/sdg1  ceph-5151d9a4-a771-470e-9f1b-3adc7b2ce655 osd-block-f1e4aed4-1e39-4dc1-ac3e-15ba8e0ba273
/dev/sdh1  ceph-a272d168-960d-4097-96e4-611b63eb6fb7 osd-block-4ca77ece-8a44-4528-9f41-3d0778694ce5
/dev/sdi1  ceph-4bfaac92-baa6-4923-a430-0bbd3a8c8cff osd-block-825c8cf5-f013-42ae-a77e-cc836c0f64e7
/dev/sdj1  ceph-ffb861fe-480b-4c8f-a11b-7b7fdc26449d osd-block-87c682f4-7d9d-450f-93a7-f8ae4c332292
/dev/sdk1  ceph-aea7dd54-717c-4b0a-b972-255de4cfc5ea osd-block-57e236ae-6733-4687-91e7-b969ff51bf91
/dev/sdl1  ceph-bcef5fb7-8c14-4e1c-b5c5-5045f4ad4f39 osd-block-1b5dc02d-1d91-4c67-ab6b-b62f375a74ac
/dev/sdm1  ceph-dc61034e-55cb-48f5-b5d8-68ed5f8412d2 osd-block-64c7f755-dbbf-491e-be1c-aa95e477bb13
/dev/sdn1  ceph-87187d4f-bb19-4ebe-9829-c9032fb8c624 osd-block-f3121f14-c71c-4f4e-9817-a867daf85268
/dev/sdo1  ceph-620076a6-01ac-44b4-86ae-c90d649dba2f osd-block-6f4575b2-5952-43d2-9917-00eeece10404
/dev/sdp1  ceph-a7cf73d8-cf1f-4ee2-9bda-5b1568cd831e osd-block-1c754e67-e826-419d-ae81-f82f20cc22e3
/dev/sdq1  ceph-bd3f925f-aebc-4e95-b9ae-375a10148898 osd-block-98e0c4bc-8db7-4fab-b2e2-b4c206c18070
/dev/sdr1  ceph-ec8f2baf-5ed3-4760-93d5-ec9d62a2919c osd-block-472f923e-ef2c-43d1-8bfd-67dfd037bee6
/dev/sds1  ceph-804b46a9-736f-4c59-9e51-268e849bc0c6 osd-block-724549fb-db8b-4a18-b29b-84f8c61bdee3
/dev/sdt1  ceph-2e5f9311-c1f3-4c77-81ae-75a94b8d5d21 osd-block-d833b309-ba8a-48cb-8ac3-09e90e4e08ec
/dev/sdu1  ceph-4b01a555-55fe-4a46-88da-c76fb2dd6f2b osd-block-d9546f83-c230-485e-930e-3d7bdbc8792e
/dev/sdv1  ceph-0f336814-847e-447f-972c-2cd8c63b0f77 osd-block-ea178336-6a4b-482b-a2d5-0fc66ef681fb
/dev/sdw1  ceph-8fede35f-e28f-4f13-8d42-1aace61d897c osd-block-97a8568c-2967-42b1-8742-30dd9cd8824c
/dev/sdx1  ceph-a517c6e0-0edc-4497-b0b3-1b7eb5a2df74 osd-block-62921d5b-c23a-4272-a53b-a7ac72ddcfb3
/dev/sdy1  ceph-39808dde-4448-4e21-9381-64386656e87c osd-block-72d74711-3256-455c-b777-2f937f278961

root@vlab-ext-jfesx77-pvsa:~# ceph-volume lvm list

====== osd.22 ======

[block]       /dev/ceph-b91653d0-c161-448b-a6e2-962316650c54/osd-block-aa5bdf71-e94d-49f2-9e1e-33b9f95466e8

block device              /dev/ceph-b91653d0-c161-448b-a6e2-962316650c54/osd-block-aa5bdf71-e94d-49f2-9e1e-33b9f95466e8
block uuid                zD5wYF-1Cds-t3hm-1CXW-FXXD-h1Ns-U26zxF
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdb1
db uuid                   078a4349-75b0-4645-9535-804ba3277abd
encrypted                 0
osd fsid                  aa5bdf71-e94d-49f2-9e1e-33b9f95466e8
osd id                    22
type                      block
vdo                       0
devices                   /dev/sdf1

[db]          /dev/sdb1

PARTUUID                  078a4349-75b0-4645-9535-804ba3277abd

====== osd.23 ======

[block]       /dev/ceph-5151d9a4-a771-470e-9f1b-3adc7b2ce655/osd-block-f1e4aed4-1e39-4dc1-ac3e-15ba8e0ba273

block device              /dev/ceph-5151d9a4-a771-470e-9f1b-3adc7b2ce655/osd-block-f1e4aed4-1e39-4dc1-ac3e-15ba8e0ba273
block uuid                VYylVT-NiAz-8OJE-nlMB-42m4-Uvkl-pxPRD1
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdc1
db uuid                   228936b5-7b31-45f1-b0c9-377a1a2b0f39
encrypted                 0
osd fsid                  f1e4aed4-1e39-4dc1-ac3e-15ba8e0ba273
osd id                    23
type                      block
vdo                       0
devices                   /dev/sdg1

[db]          /dev/sdc1

PARTUUID                  228936b5-7b31-45f1-b0c9-377a1a2b0f39

====== osd.24 ======

[block]       /dev/ceph-001d4412-6b8d-4d66-9726-2a813dfcbbe1/osd-block-e6283510-3dfd-451e-b507-0576923f8797

block device              /dev/ceph-001d4412-6b8d-4d66-9726-2a813dfcbbe1/osd-block-e6283510-3dfd-451e-b507-0576923f8797
block uuid                6Bze5i-dCsf-2gFx-FqsO-xzSZ-cDuF-9xCs4y
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdb2
db uuid                   2427ca8c-da28-4d43-acd4-b23323c4133b
encrypted                 0
osd fsid                  e6283510-3dfd-451e-b507-0576923f8797
osd id                    24
type                      block
vdo                       0
devices                   /dev/sdd1

[db]          /dev/sdb2

PARTUUID                  2427ca8c-da28-4d43-acd4-b23323c4133b

====== osd.25 ======

[block]       /dev/ceph-a4f31efe-71d8-4add-962b-dd7325f4fdfe/osd-block-752fbcac-f638-4560-a1af-9ccb2c19c73c

block device              /dev/ceph-a4f31efe-71d8-4add-962b-dd7325f4fdfe/osd-block-752fbcac-f638-4560-a1af-9ccb2c19c73c
block uuid                wtRpLx-kR2a-504i-vnw6-8qNE-zUqF-MRuQCP
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdc2
db uuid                   51939dbc-2d86-4e48-a352-8b80cd1296c7
encrypted                 0
osd fsid                  752fbcac-f638-4560-a1af-9ccb2c19c73c
osd id                    25
type                      block
vdo                       0
devices                   /dev/sde1

[db]          /dev/sdc2

PARTUUID                  51939dbc-2d86-4e48-a352-8b80cd1296c7

====== osd.26 ======

[block]       /dev/ceph-87187d4f-bb19-4ebe-9829-c9032fb8c624/osd-block-f3121f14-c71c-4f4e-9817-a867daf85268

block device              /dev/ceph-87187d4f-bb19-4ebe-9829-c9032fb8c624/osd-block-f3121f14-c71c-4f4e-9817-a867daf85268
block uuid                I4dhqa-8Yb2-jRGQ-fz7V-wpVY-GV0v-aiCPoA
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdb3
db uuid                   6b5283f1-d87a-4d6c-a768-1d7541226f46
encrypted                 0
osd fsid                  f3121f14-c71c-4f4e-9817-a867daf85268
osd id                    26
type                      block
vdo                       0
devices                   /dev/sdn1

[db]          /dev/sdb3

PARTUUID                  6b5283f1-d87a-4d6c-a768-1d7541226f46

====== osd.27 ======

[block]       /dev/ceph-620076a6-01ac-44b4-86ae-c90d649dba2f/osd-block-6f4575b2-5952-43d2-9917-00eeece10404

block device              /dev/ceph-620076a6-01ac-44b4-86ae-c90d649dba2f/osd-block-6f4575b2-5952-43d2-9917-00eeece10404
block uuid                3NF514-mq6K-NvHd-551A-grjM-wJky-nk7rNY
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdc3
db uuid                   e1242c5e-e276-4303-949d-fe2557dcc832
encrypted                 0
osd fsid                  6f4575b2-5952-43d2-9917-00eeece10404
osd id                    27
type                      block
vdo                       0
devices                   /dev/sdo1

[db]          /dev/sdc3

PARTUUID                  e1242c5e-e276-4303-949d-fe2557dcc832

====== osd.28 ======

[block]       /dev/ceph-bcef5fb7-8c14-4e1c-b5c5-5045f4ad4f39/osd-block-1b5dc02d-1d91-4c67-ab6b-b62f375a74ac

block device              /dev/ceph-bcef5fb7-8c14-4e1c-b5c5-5045f4ad4f39/osd-block-1b5dc02d-1d91-4c67-ab6b-b62f375a74ac
block uuid                gnkAE5-ljL0-k8PW-1dOe-3NYl-X1Ah-70G4Kc
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdb4
db uuid                   6c783b68-a0fa-4e4a-be9e-836228390d93
encrypted                 0
osd fsid                  1b5dc02d-1d91-4c67-ab6b-b62f375a74ac
osd id                    28
type                      block
vdo                       0
devices                   /dev/sdl1

[db]          /dev/sdb4

PARTUUID                  6c783b68-a0fa-4e4a-be9e-836228390d93

====== osd.29 ======

[block]       /dev/ceph-dc61034e-55cb-48f5-b5d8-68ed5f8412d2/osd-block-64c7f755-dbbf-491e-be1c-aa95e477bb13

block device              /dev/ceph-dc61034e-55cb-48f5-b5d8-68ed5f8412d2/osd-block-64c7f755-dbbf-491e-be1c-aa95e477bb13
block uuid                mB4rYz-fP0e-gMHv-dg3y-bYw1-jfaQ-1vOvHt
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdc4
db uuid                   9d84911c-f0c3-49ec-8002-b394cfba8ae9
encrypted                 0
osd fsid                  64c7f755-dbbf-491e-be1c-aa95e477bb13
osd id                    29
type                      block
vdo                       0
devices                   /dev/sdm1

[db]          /dev/sdc4

PARTUUID                  9d84911c-f0c3-49ec-8002-b394cfba8ae9

====== osd.30 ======

[block]       /dev/ceph-ffb861fe-480b-4c8f-a11b-7b7fdc26449d/osd-block-87c682f4-7d9d-450f-93a7-f8ae4c332292

block device              /dev/ceph-ffb861fe-480b-4c8f-a11b-7b7fdc26449d/osd-block-87c682f4-7d9d-450f-93a7-f8ae4c332292
block uuid                XPMwnR-vyx8-1DcL-uJQA-R9HU-W9Ha-eSx60A
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdb5
db uuid                   44a882be-64ee-4946-88ad-8733d8e8f169
encrypted                 0
osd fsid                  87c682f4-7d9d-450f-93a7-f8ae4c332292
osd id                    30
type                      block
vdo                       0
devices                   /dev/sdj1

[db]          /dev/sdb5

PARTUUID                  44a882be-64ee-4946-88ad-8733d8e8f169

====== osd.31 ======

[block]       /dev/ceph-aea7dd54-717c-4b0a-b972-255de4cfc5ea/osd-block-57e236ae-6733-4687-91e7-b969ff51bf91

block device              /dev/ceph-aea7dd54-717c-4b0a-b972-255de4cfc5ea/osd-block-57e236ae-6733-4687-91e7-b969ff51bf91
block uuid                898J15-9lma-JFb2-ntio-DFGF-tTZh-imQcI6
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdc5
db uuid                   508b2438-fbae-42ef-82c1-9e5c59b669b9
encrypted                 0
osd fsid                  57e236ae-6733-4687-91e7-b969ff51bf91
osd id                    31
type                      block
vdo                       0
devices                   /dev/sdk1

[db]          /dev/sdc5

PARTUUID                  508b2438-fbae-42ef-82c1-9e5c59b669b9

====== osd.32 ======

[block]       /dev/ceph-a272d168-960d-4097-96e4-611b63eb6fb7/osd-block-4ca77ece-8a44-4528-9f41-3d0778694ce5

block device              /dev/ceph-a272d168-960d-4097-96e4-611b63eb6fb7/osd-block-4ca77ece-8a44-4528-9f41-3d0778694ce5
block uuid                qeq6kP-8bZj-8GNf-rmgv-SoJ7-0pm7-6CQeL5
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdb6
db uuid                   9b9bd691-9565-4998-b86e-a323e083a585
encrypted                 0
osd fsid                  4ca77ece-8a44-4528-9f41-3d0778694ce5
osd id                    32
type                      block
vdo                       0
devices                   /dev/sdh1

[db]          /dev/sdb6

PARTUUID                  9b9bd691-9565-4998-b86e-a323e083a585

====== osd.33 ======

[block]       /dev/ceph-4bfaac92-baa6-4923-a430-0bbd3a8c8cff/osd-block-825c8cf5-f013-42ae-a77e-cc836c0f64e7

block device              /dev/ceph-4bfaac92-baa6-4923-a430-0bbd3a8c8cff/osd-block-825c8cf5-f013-42ae-a77e-cc836c0f64e7
block uuid                pJeZLX-Ca1x-k9FW-OyJs-btpx-bw6C-ekgLda
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdc6
db uuid                   59993cdb-45be-4b93-9540-b78a7f51fff8
encrypted                 0
osd fsid                  825c8cf5-f013-42ae-a77e-cc836c0f64e7
osd id                    33
type                      block
vdo                       0
devices                   /dev/sdi1

[db]          /dev/sdc6

PARTUUID                  59993cdb-45be-4b93-9540-b78a7f51fff8

====== osd.34 ======

[block]       /dev/ceph-0f336814-847e-447f-972c-2cd8c63b0f77/osd-block-ea178336-6a4b-482b-a2d5-0fc66ef681fb

block device              /dev/ceph-0f336814-847e-447f-972c-2cd8c63b0f77/osd-block-ea178336-6a4b-482b-a2d5-0fc66ef681fb
block uuid                GVUagq-1jUw-c6sZ-BFdm-tITt-PEGg-4OseEf
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdb7
db uuid                   b8b64246-95ee-4445-b8d6-3a9f37529da3
encrypted                 0
osd fsid                  ea178336-6a4b-482b-a2d5-0fc66ef681fb
osd id                    34
type                      block
vdo                       0
devices                   /dev/sdv1

[db]          /dev/sdb7

PARTUUID                  b8b64246-95ee-4445-b8d6-3a9f37529da3

====== osd.35 ======

[block]       /dev/ceph-8fede35f-e28f-4f13-8d42-1aace61d897c/osd-block-97a8568c-2967-42b1-8742-30dd9cd8824c

block device              /dev/ceph-8fede35f-e28f-4f13-8d42-1aace61d897c/osd-block-97a8568c-2967-42b1-8742-30dd9cd8824c
block uuid                UmNYY5-NvSo-XOkc-LrKL-axqe-iv0V-dHdY60
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdc7
db uuid                   7da659f5-5450-49ff-97a6-13ee4b14c6bb
encrypted                 0
osd fsid                  97a8568c-2967-42b1-8742-30dd9cd8824c
osd id                    35
type                      block
vdo                       0
devices                   /dev/sdw1

[db]          /dev/sdc7

PARTUUID                  7da659f5-5450-49ff-97a6-13ee4b14c6bb

====== osd.36 ======

[block]       /dev/ceph-2e5f9311-c1f3-4c77-81ae-75a94b8d5d21/osd-block-d833b309-ba8a-48cb-8ac3-09e90e4e08ec

block device              /dev/ceph-2e5f9311-c1f3-4c77-81ae-75a94b8d5d21/osd-block-d833b309-ba8a-48cb-8ac3-09e90e4e08ec
block uuid                E9SNde-WeNX-kBj5-1AGd-Lnmw-y05r-4XRaMa
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdb8
db uuid                   f9f135e6-addd-4e3e-8ee4-6e9e33a9e59e
encrypted                 0
osd fsid                  d833b309-ba8a-48cb-8ac3-09e90e4e08ec
osd id                    36
type                      block
vdo                       0
devices                   /dev/sdt1

[db]          /dev/sdb8

PARTUUID                  f9f135e6-addd-4e3e-8ee4-6e9e33a9e59e

====== osd.37 ======

[block]       /dev/ceph-4b01a555-55fe-4a46-88da-c76fb2dd6f2b/osd-block-d9546f83-c230-485e-930e-3d7bdbc8792e

block device              /dev/ceph-4b01a555-55fe-4a46-88da-c76fb2dd6f2b/osd-block-d9546f83-c230-485e-930e-3d7bdbc8792e
block uuid                5S2r6U-ZX8M-F1pB-Zuo6-LyXG-GVoq-asTxt5
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdc8
db uuid                   c9515e6f-370f-4b67-9a75-a797369651f7
encrypted                 0
osd fsid                  d9546f83-c230-485e-930e-3d7bdbc8792e
osd id                    37
type                      block
vdo                       0
devices                   /dev/sdu1

[db]          /dev/sdc8

PARTUUID                  c9515e6f-370f-4b67-9a75-a797369651f7

====== osd.38 ======

[block]       /dev/ceph-ec8f2baf-5ed3-4760-93d5-ec9d62a2919c/osd-block-472f923e-ef2c-43d1-8bfd-67dfd037bee6

block device              /dev/ceph-ec8f2baf-5ed3-4760-93d5-ec9d62a2919c/osd-block-472f923e-ef2c-43d1-8bfd-67dfd037bee6
block uuid                AMiEZ2-sPbW-FXfQ-VIiC-VMo6-vj5b-Vj2Lza
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdb9
db uuid                   cb22a904-8bf3-4141-8239-1df75b5b0ed6
encrypted                 0
osd fsid                  472f923e-ef2c-43d1-8bfd-67dfd037bee6
osd id                    38
type                      block
vdo                       0
devices                   /dev/sdr1

[db]          /dev/sdb9

PARTUUID                  cb22a904-8bf3-4141-8239-1df75b5b0ed6

====== osd.39 ======

[block]       /dev/ceph-804b46a9-736f-4c59-9e51-268e849bc0c6/osd-block-724549fb-db8b-4a18-b29b-84f8c61bdee3

block device              /dev/ceph-804b46a9-736f-4c59-9e51-268e849bc0c6/osd-block-724549fb-db8b-4a18-b29b-84f8c61bdee3
block uuid                yjZkNE-EuzY-fwwn-6c3F-ZfJu-oSEb-qK9wA8
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdc9
db uuid                   cfa5314c-4580-41e1-bd11-67699dc8c016
encrypted                 0
osd fsid                  724549fb-db8b-4a18-b29b-84f8c61bdee3
osd id                    39
type                      block
vdo                       0
devices                   /dev/sds1

[db]          /dev/sdc9

PARTUUID                  cfa5314c-4580-41e1-bd11-67699dc8c016

====== osd.40 ======

[block]       /dev/ceph-a7cf73d8-cf1f-4ee2-9bda-5b1568cd831e/osd-block-1c754e67-e826-419d-ae81-f82f20cc22e3

block device              /dev/ceph-a7cf73d8-cf1f-4ee2-9bda-5b1568cd831e/osd-block-1c754e67-e826-419d-ae81-f82f20cc22e3
block uuid                x6jMsX-Ha3o-Kxyi-vKMb-fdHl-bNqJ-hZjUjH
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdb10
db uuid                   0348f055-ee93-4518-b81d-358a0511bb53
encrypted                 0
osd fsid                  1c754e67-e826-419d-ae81-f82f20cc22e3
osd id                    40
type                      block
vdo                       0
devices                   /dev/sdp1

[db]          /dev/sdb10

PARTUUID                  0348f055-ee93-4518-b81d-358a0511bb53

====== osd.41 ======

[block]       /dev/ceph-bd3f925f-aebc-4e95-b9ae-375a10148898/osd-block-98e0c4bc-8db7-4fab-b2e2-b4c206c18070

block device              /dev/ceph-bd3f925f-aebc-4e95-b9ae-375a10148898/osd-block-98e0c4bc-8db7-4fab-b2e2-b4c206c18070
block uuid                sJB1x7-zJNv-h6jl-7NXC-YhjD-DSVe-9coCi4
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdc10
db uuid                   1df3d058-745d-4611-9fa6-6223e6c49a7e
encrypted                 0
osd fsid                  98e0c4bc-8db7-4fab-b2e2-b4c206c18070
osd id                    41
type                      block
vdo                       0
devices                   /dev/sdq1

[db]          /dev/sdc10

PARTUUID                  1df3d058-745d-4611-9fa6-6223e6c49a7e

====== osd.42 ======

[block]       /dev/ceph-a517c6e0-0edc-4497-b0b3-1b7eb5a2df74/osd-block-62921d5b-c23a-4272-a53b-a7ac72ddcfb3

block device              /dev/ceph-a517c6e0-0edc-4497-b0b3-1b7eb5a2df74/osd-block-62921d5b-c23a-4272-a53b-a7ac72ddcfb3
block uuid                sZ2aie-PR0c-WvTl-hxre-FZOB-1tJ9-nAcFay
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdb11
db uuid                   7d574e0c-26fa-4b1d-8af1-3d55d46b2234
encrypted                 0
osd fsid                  62921d5b-c23a-4272-a53b-a7ac72ddcfb3
osd id                    42
type                      block
vdo                       0
devices                   /dev/sdx1

[db]          /dev/sdb11

PARTUUID                  7d574e0c-26fa-4b1d-8af1-3d55d46b2234

====== osd.43 ======

[block]       /dev/ceph-39808dde-4448-4e21-9381-64386656e87c/osd-block-72d74711-3256-455c-b777-2f937f278961

block device              /dev/ceph-39808dde-4448-4e21-9381-64386656e87c/osd-block-72d74711-3256-455c-b777-2f937f278961
block uuid                Rdvinz-YEqF-pjnj-Hy6Z-WjIU-FdhB-YXVtGD
cephx lockbox secret
cluster fsid              1fd0cf11-6d7c-412d-abf9-515e6c83962c
cluster name              ceph
crush device class        None
db device                 /dev/sdc11
db uuid                   b8132096-e8fa-438a-9ceb-53e1e7398aea
encrypted                 0
osd fsid                  72d74711-3256-455c-b777-2f937f278961
osd id                    43
type                      block
vdo                       0
devices                   /dev/sdy1

[db]          /dev/sdc11

PARTUUID                  b8132096-e8fa-438a-9ceb-53e1e7398aea
root@vlab-ext-jfesx77-pvsa:~#

Can you try:

ceph-volume lvm activate --all

If that fails, try:

lvm vgs -o name | grep ceph | xargs vgchange -ay
ceph-volume lvm activate --all

If this also fails, please post the output.


The ceph-volume command worked on the first try! All of the OSDs came online. I then tried rebooting the node again and the same issue occurred, so I tried just the ceph-volume command, but got the output below. It turns out that I do have to run the lvm command first and then the ceph-volume command to get all of the OSDs back online. Thanks for all your help!

root@vlab-ext-jfesx77-pvsa:~# ceph-volume lvm activate --all
--> OSD ID 22 FSID aa5bdf71-e94d-49f2-9e1e-33b9f95466e8 process is active. Skipping activation
--> OSD ID 32 FSID 4ca77ece-8a44-4528-9f41-3d0778694ce5 process is active. Skipping activation
--> OSD ID 24 FSID e6283510-3dfd-451e-b507-0576923f8797 process is active. Skipping activation
--> OSD ID 38 FSID 472f923e-ef2c-43d1-8bfd-67dfd037bee6 process is active. Skipping activation
--> OSD ID 29 FSID 64c7f755-dbbf-491e-be1c-aa95e477bb13 process is active. Skipping activation
--> Activating OSD ID 41 FSID 98e0c4bc-8db7-4fab-b2e2-b4c206c18070
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-41
Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-bd3f925f-aebc-4e95-b9ae-375a10148898/osd-block-98e0c4bc-8db7-4fab-b2e2-b4c206c18070 --path /var/lib/ceph/osd/ceph-41 --no-mon-config
stderr: failed to read label for
stderr: /dev/ceph-bd3f925f-aebc-4e95-b9ae-375a10148898/osd-block-98e0c4bc-8db7-4fab-b2e2-b4c206c18070
stderr: :
stderr: (2) No such file or directory
stderr:
-->  RuntimeError: command returned non-zero exit status: 1

It worked the first time probably because I had asked you earlier to run the lvm command, so the VGs were already activated. However, when you reboot, the VGs are not automatically activated, hence the ceph-volume activation fails.

So the real issue is why the VGs are not activated on boot. I am not sure, but this may be done within the initramfs at boot, and in your case it may require a driver/module to be included in the initramfs.
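
For illustration only, and assuming a Debian-style initramfs-tools setup, checking which driver the JBOD HBA uses and forcing it into the initramfs could look roughly like this (the mpt3sas name is just an example, not taken from your system):

# see which kernel driver the SAS/JBOD controller is using
lspci -k | grep -iA3 sas

# add that module to the initramfs list and rebuild the initramfs
echo mpt3sas >> /etc/initramfs-tools/modules
update-initramfs -u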

Another, simpler solution might be to run the LVM activation command as a pre-exec step of the ceph-volume service, or as a separate custom one-shot service that ceph-volume depends on and starts after. A rough sketch is below.
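
Purely as a sketch of that idea (the unit name ceph-activate-vgs.service is made up for the example; ceph-osd.target is the standard Ceph OSD target), a custom one-shot service could look something like:

# /etc/systemd/system/ceph-activate-vgs.service  (example unit, not shipped with anything)
[Unit]
Description=Activate ceph volume groups before the OSDs start
After=local-fs.target
Before=ceph-osd.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/sh -c 'vgs -o name --noheadings | grep ceph | xargs -r vgchange -ay'

[Install]
WantedBy=ceph-osd.target

Then enable it with: systemctl daemon-reload && systemctl enable ceph-activate-vgs.service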

If you do not mind waiting one or two days, I can send you a patch to help a bit with this. We do have a fallback script that runs after boot and tries to force-start OSDs that failed to start for whatever reason; we can put the LVM activation command into that script. This is not a perfect solution, as the OSDs will initially come up as failed and the script only kicks in after some 5 minutes or so. If that is acceptable I will send you the patch; otherwise the earlier proposed solution would be better.
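
Roughly, the idea is just to retry activation from that fallback script, something along these lines (illustrative only; the actual patch may differ):

#!/bin/sh
# illustrative sketch only -- the real fallback script in the patch may differ
# activate any ceph VGs that did not come up on boot
vgs -o name --noheadings | grep ceph | xargs -r vgchange -ay
# then retry bringing up any OSDs that are still down
ceph-volume lvm activate --all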

Sure, we would be interested in a patch like you describe, thanks!


Download the patch from:

https://drive.google.com/open?id=1lz3AtJK1hnUGzlwVQFpkLiIsPSChKTaP

Apply it via:

patch -p1 -d / < force_activate_vgs.patch

As this is a fallback process, it may take 2-5 minutes after reboot for it to kick in.
