
Update 3.2 issues

Hello,

After the update to 3.2, the OSDs are not updating.

When I (re)try to execute the update command

ceph osd require-osd-release quincy

ceph stops working / hangs for a long time, and the monitor log shows:

mon.Ceph02 (mon.2) 8959 : cluster [INF] disallowing boot of quincy+ OSD osd.21 v2:10.0.1.13:6826/153756 because require_osd_release < octopus

root@Ceph01:~# ceph -s
  cluster:
    id:     xxxxxxxxxxxxxxxxxxxxxxxxxx
    health: HEALTH_WARN
            1 filesystem is degraded
            1/3 mons down, quorum Ceph01,Ceph02
            noout flag(s) set
            3 osds down
            all OSDs are running octopus or later but require_osd_release < octopus
            Reduced data availability: 4385 pgs inactive, 347 pgs down, 2514 pgs peering
            Degraded data redundancy: 32475412/259722717 objects degraded (12.504%), 576 pgs degraded, 582 pgs undersized
            6381 slow ops, oldest one blocked for 4570 sec, mon.Ceph02 has slow ops

  services:
    mon: 3 daemons, quorum Ceph01,Ceph02 (age 0.366662s), out of quorum: Ceph03
    mgr: Ceph03(active, since 5h), standbys: Ceph02, Ceph01
    mds: 1/1 daemons up, 2 standby
    osd: 66 osds: 19 up (since 6h), 22 in (since 6h); 1794 remapped pgs
         flags noout

  data:
    volumes: 0/1 healthy, 1 recovering
    pools:   4 pools, 4385 pgs
    objects: 86.57M objects, 193 TiB
    usage:   278 TiB used, 84 TiB / 361 TiB avail
    pgs:     21.482% pgs unknown
             78.518% pgs not active
             32475412/259722717 objects degraded (12.504%)
             2514 peering
             942  unknown
             576  undersized+degraded+peered
             347  down
             6    undersized+peered

Tried restarting the nodes/OSDs but no luck so far. Any ideas?

Hello,

I have exactly the same error pattern.

We initially set up PetaSAN with version 2.1.0 and have since installed every available update.

All OSDs on all OSD nodes are listed as "down, in".

All OSD nodes have Ubuntu 20.04.6 LTS (Focal Fossa) installed and all packages are up to date.

The ceph version on the OSD nodes is "ceph/petasan-v3,now 17.2.5-1petasan amd64 [installed]".

The last update command, "ceph osd require-osd-release quincy", is stuck. None of the OSDs are being recognized by the cluster, although all OSD daemons have started successfully and show no errors.
It seems to me that the monitor nodes can no longer communicate with the OSD nodes. I can rule out network problems.
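A quick way to check whether this is the same symptom as in the first post is to grep the monitor logs for the "disallowing boot" message quoted above (assuming the default log location under /var/log/ceph, adjust if yours differs):

grep "disallowing boot" /var/log/ceph/*.log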

I need urgent help please!

Thanks very much.

Support is looking into my issue.

Can you post the output of:

ceph status
ceph osd dump | grep release
ceph versions

Our PetaSAN is running again.

Support compiled a modified ceph-mon binary which was able to ignore the wrong version of the OSDs.

If it is the same issue, it is related to
https://www.spinics.net/lists/ceph-users/msg74089.html

It happens if the cluster was originally set up on an old release and updated over time. Note that we will provide a fix in our upgrade script to handle this automatically, so the following is only for the case where you have already upgraded and have the issue.
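You can confirm this state with the same check as requested above; on an affected cluster the recorded release will still be older than octopus:

ceph osd dump | grep require_osd_release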

Download the modified monitor binary:
wget https://www.petasan.org/fixes/320/ceph-mon.gz
gunzip ceph-mon.gz

On the first 3 nodes:

mv /usr/bin/ceph-mon /usr/bin/ceph-mon-orig
chmod +x ceph-mon
cp ceph-mon /usr/bin
ln -s /usr/lib/x86_64-linux-gnu/ceph/libceph-common.so.2 /usr/lib/x86_64-linux-gnu/libceph-common.so.2
systemctl restart ceph-mon.target
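If the restarted monitors do not come up, one generic sanity check (not a required step) is to confirm the replacement binary resolves libceph-common through the symlink created above:

ldd /usr/bin/ceph-mon | grep libceph-common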

Make sure the 3 new monitors have started and are in quorum using
ceph status
If yes, then run
ceph osd require-osd-release octopus
If all goes well, the following command will show octopus:
ceph osd dump | grep release

If all is OK, reboot all nodes in the cluster.
Only when all OSDs are up, all PGs are active/clean, and the only remaining health warning is
all OSDs are running quincy or later but require_osd_release < quincy
then at this point run
ceph osd require-osd-release quincy
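To verify you have reached that point, a couple of plain Ceph checks (nothing PetaSAN-specific) should show all PGs active+clean and only the require_osd_release warning remaining:

ceph pg stat
ceph health detail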

Again, we will fix the upgrade scripts to handle this automatically, without the need for this binary.

The upgrade script has been modified to deal with this case. The script is fetched dynamically online, so no changes are required locally.