Forums


New cluster not healthy

Hello,

I finally set up my brand new cluster and updated it to the latest 2.5.1 version, but according to the dashboard it is not healthy, and it does not get better with time. This is the message:

No, this is not normal. If you just installed it and it holds no real data, I would recommend re-installing it. Also, if this install includes the kernel you built yourself (see your prior posts), that could be the issue.

Actually I built and replaced only the "atlantic.ko" module; could this cause the issue? (To be clear: I built and installed it on node 1, and replaced only the atlantic.ko file on nodes 2 and 3.)

But even if I re-install all three nodes I "must" use the new atlantic.ko module, otherwise the cluster would be useless...

 

The default pools created too many PGs for your OSD disk count. Most probably during cluster creation you specified a range of 15-50 disks while you had only 5.

To fix: manually delete the pools / filesystem and create new pools with a smaller number of PGs (256 PGs in total across all pools).
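As a rough illustration of where a figure like 256 can come from, here is a minimal sketch of the common Ceph rule of thumb: total PGs is about (OSD count × target PGs per OSD) / replica count, rounded to a power of two. The function name `suggest_pg_total`, the replica count of 3, the target of 100 PGs per OSD, and the choice to round up are all assumptions for this example and are not stated in the thread.

```shell
# Hypothetical helper: rule-of-thumb total PG count for a cluster.
# Arguments: OSD count, replica count, target PGs per OSD (default 100).
# The body runs in a subshell so its variables stay local.
suggest_pg_total() (
    osds=$1
    replicas=$2
    per_osd=${3:-100}
    raw=$(( osds * per_osd / replicas ))
    pg=1
    # Round up to the next power of two.
    while [ "$pg" -lt "$raw" ]; do
        pg=$(( pg * 2 ))
    done
    echo "$pg"
)

suggest_pg_total 5 3     # 5 OSDs today, assuming 3 replicas: prints 256
suggest_pg_total 40 3    # planned 40 OSDs, assuming 3 replicas: prints 2048
```

Under these assumptions, 5 OSDs land on 256 PGs in total, which agrees with the number suggested above, while a pool sized for the 15-50 disk range would be several times larger.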

Quote from admin on March 10, 2020, 5:50 pm

Most probably during cluster creation you specified a range of 15-50 disks while you had only 5.

Correct! I chose up to 50 disks because I currently have 3 hosts which can hold up to 10 HDDs each; there are only 5 disks now, but I'm buying the other 25. Then, in the coming months, I want to add a 4th node, so in the end I'll have 40 spinning disks and 4 NVMe SSDs.

If I select a smaller number of disks now, will I be able to increase it in the future?

For now I'll follow your suggestion, there are no data yet in the cluster.

Thanks.

Fixed! Now there's only this warning left; does it appear to be serious or not?

BlueFS spillover detected on 2 OSD(s)

Thanks. Ste

Do the online upgrade, then delete the journal and OSDs, then re-add the OSDs.

To delete an OSD, stop its service (X is the OSD id):

systemctl stop ceph-osd@X

Then delete it from the UI.

Re-add it from the UI.

If this is not a new cluster, you need to delete the OSDs and journal one node at a time, and do not move on to the next node until the cluster is back to a healthy (OK) state.

If this is a new cluster, you can download the latest ISO and re-install.
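For the one-node-at-a-time case, the "wait until the cluster is healthy" step could be scripted roughly as follows. This is only a sketch: the function name `wait_for_health_ok` and the 30-second default polling interval are assumptions, and it relies on `ceph health` printing a line that starts with `HEALTH_OK` once the cluster has recovered.

```shell
# Hypothetical helper: block until `ceph health` reports HEALTH_OK.
# Argument: polling interval in seconds (default 30, an arbitrary choice).
# The body runs in a subshell so its variables stay local.
wait_for_health_ok() (
    interval=${1:-30}
    until ceph health | grep -q '^HEALTH_OK'; do
        echo "cluster not healthy yet, rechecking in ${interval}s" >&2
        sleep "$interval"
    done
)

# Intended use, run on a node with the ceph CLI available:
#   1. stop, delete and re-add the OSDs of one node as described above
#   2. wait_for_health_ok
#   3. only then repeat on the next node
```

Note that a remaining warning, such as the BlueFS spillover one, keeps the status at HEALTH_WARN, so the loop only returns once every warning has cleared.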