
Vlan configuration

Quote from admin on March 13, 2018, 6:08 pm

Hi,

Can you run the du command to see how much data these OSDs have for the problem PG?
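Something like the following is what I mean, assuming FileStore OSDs mounted at the default /var/lib/ceph/osd/CLUSTER_NAME-<id> path (adjust the cluster name and OSD IDs to your setup):

# on the node hosting each of the currently assigned OSDs (2, 35, 23),
# check how much data the OSD holds for PG 1.e0e
du -sh /var/lib/ceph/osd/CLUSTER_NAME-2/current/1.e0e_head/
du -sh /var/lib/ceph/osd/CLUSTER_NAME-35/current/1.e0e_head/
du -sh /var/lib/ceph/osd/CLUSTER_NAME-23/current/1.e0e_head/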

Sadly 0 bytes for all of them


The original copies of PG 1.e0e were on OSDs 52, 56, 65, so they are all gone. The currently assigned OSDs 2, 35, 23 do not have any copies, so it does not look good. We can tell Ceph to forget about trying to find the data, but we will end up with empty data for this PG: the disks will have lost 1/4096 of their data, and the file system on top may or may not be repairable.

I'm okay with data loss. This is just backup of backup.

Do you want to start with a fresh new pool (erase all), or keep the existing pool with some PG data gone?

I wouldn't mind attempting a repair, because it would reduce the amount of data I need to ship, but if it doesn't work, then erasing everything will be option 2.

According to: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/012771.html
The general consensus from those threads is that as long as down_osds_we_would_probe is pointing to any OSD that can't be reached, those PGs will remain stuck incomplete and can't be cured by force_create_pg or even "ceph osd lost".
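A quick way to check which OSDs the PG is still waiting on is to query it and look at the recovery_state section; a sketch, assuming PG 1.e0e and our cluster name:

ceph pg 1.e0e query --cluster CLUSTER_NAME
# in the output, recovery_state -> down_osds_we_would_probe lists the unreachable
# OSDs (52, 56, 65 here); while it is non-empty the PG stays incomplete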
So we need to create new empty OSDs with the same IDs as the stuck OSDs 52, 56, 65.
The procedure to create an OSD with a specific ID is quite lengthy:
http://docs.ceph.com/docs/jewel/install/manual-deployment/
See the ADDING OSDS -> LONG FORM section.
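Very roughly, the long form boils down to something like this; a sketch only, assuming OSD ID 52, a data partition already prepared and mounted at the default FileStore path, and our cluster name (repeat for 56 and 65):

UUID=$(uuidgen)
ceph osd create $UUID 52 --cluster CLUSTER_NAME    # reserve the specific OSD id
mkdir -p /var/lib/ceph/osd/CLUSTER_NAME-52         # the prepared disk mounted here
ceph-osd -i 52 --mkfs --mkkey --osd-uuid $UUID --cluster CLUSTER_NAME
ceph auth add osd.52 osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/CLUSTER_NAME-52/keyring --cluster CLUSTER_NAME
ceph osd crush add osd.52 1.0 host=$(hostname -s) --cluster CLUSTER_NAME
systemctl start ceph-osd@52    # the unit may need the custom cluster name set in /etc/default/ceph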

The fresh approach (delete all) is:

ceph osd pool delete rbd rbd --yes-i-really-really-mean-it --cluster CLUSTER_NAME
ceph osd pool create rbd 4096 4096 --cluster CLUSTER_NAME
ceph osd pool set rbd size 3 --cluster CLUSTER_NAME
ceph osd pool set rbd min_size 2 --cluster CLUSTER_NAME
ceph osd pool application enable rbd rbd --cluster CLUSTER_NAME
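Once the pool is recreated, the usual status commands should confirm everything goes back to active+clean (nothing here is specific to this setup):

ceph osd pool ls detail --cluster CLUSTER_NAME   # rbd should show 4096 PGs, size 3, min_size 2
ceph pg stat --cluster CLUSTER_NAME
ceph -s --cluster CLUSTER_NAME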

Thanks. I looked at the manual creation option, but there were too many unknowns in there for me, so I've opted for the delete all.

Question: I'm unable to create any OSDs. It says they are creating, but they don't show up. Is this because of the hybrid 1.4/2.0 state I'm in? Should I finish migrating all my nodes to 2.0 before creating any OSDs (on all of the deleted drives), or is another problem stopping me from creating them?

Is this happening on 2.0 nodes or 1.4 nodes?

When you upgrade an existing 1.4 node to 2.0, do all OSDs come up, some, or none?
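If it helps, on an upgraded node something like this will show whether the OSD daemons are running and registered (the OSD id is just a placeholder):

ceph osd tree --cluster CLUSTER_NAME    # which OSDs are up/down and where they sit in the crush map
systemctl status ceph-osd@<id>          # check the daemon itself on the node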
