New cluster not healthy
Ste
125 Posts
March 10, 2020, 10:54 am
Hello,
I finally set up my brand new cluster and updated to the latest 2.5.1 version, but the dashboard does not show it as healthy, and it does not get better with time. This is the message:
Ceph Health
I'm going through the Administration guide, but it doesn't seem that I messed anything up... What can I do to fix the situation?
Moreover, I don't think it is normal to have two inactive pools (see picture), is it?
Thanks.
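For anyone comparing symptoms: the dashboard message can be cross-checked from the shell of any node with standard Ceph commands (nothing PetaSAN-specific):
ceph status                     # overall health and PG states
ceph health detail              # expands each warning
ceph pg dump_stuck inactive     # lists the PGs that never became active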
Last edited on March 10, 2020, 10:55 am by Ste · #1
admin
2,930 Posts
March 10, 2020, 1:04 pm
No, this is not normal. If you have just installed it with no real data, I would recommend re-installing. Also, if this setup includes the kernel you built yourself (prior posts), that could be the issue.
Ste
125 Posts
March 10, 2020, 4:12 pm
Actually, I built and replaced only the "atlantic.ko" module; could this cause the issue? (To be more precise: I built and installed it on node 1, and copied only the atlantic.ko file to nodes 2 and 3.)
But even if I re-install all three nodes, I "must" use the new atlantic.ko module, otherwise the cluster would be useless...
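A quick sanity check, for what it's worth: confirming that all three nodes actually load the same rebuilt driver (the interface name below is a placeholder):
modinfo atlantic | grep -E 'filename|version'   # module file and version on disk
ethtool -i eth0                                 # driver version bound to the NIC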
Last edited on March 10, 2020, 4:23 pm by Ste · #3
admin
2,930 Posts
March 10, 2020, 5:50 pm
The default pools created too many PGs for your OSD disk count. Most probably, during cluster creation you specified a range of 15-50 disks while you had only 5.
To fix: manually delete the pools/filesystem and create new pools with a smaller number of PGs (256 PGs in total across all pools).
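For context, the usual rule of thumb behind that 256 figure: target about 100 PGs per OSD, divided by the replica count and rounded to a power of two, so 5 OSDs × 100 / 3 ≈ 167, which rounds up to 256. If the recreation were done from the shell rather than the UI, it would look roughly like this (the pool name "rbd" is an example; the actual PetaSAN pool names may differ):
ceph osd pool delete rbd rbd --yes-i-really-really-mean-it   # needs mon_allow_pool_delete=true
ceph osd pool create rbd 256 256 replicated                  # pg_num and pgp_num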
Ste
125 Posts
March 10, 2020, 6:36 pm
Quote from admin on March 10, 2020, 5:50 pm
Most probably during cluster creation you specified a range of 15-50 disks while you had only 5.
Correct! I chose up to 50 disks because I currently have 3 hosts that can hold up to 10 HDDs each; there are only 5 now, but I'm buying the other 25. Then, in the coming months, I want to add a 4th node, so in the end I'll have 40 spinning disks and 4 NVMe SSDs.
If I select a smaller number of disks now, will I be able to increase it in the future?
For now I'll follow your suggestion; there is no data in the cluster yet.
Thanks.
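On the question of growing later: Ceph does allow raising a pool's PG count after creation as OSDs are added, so starting small is safe. A minimal sketch (the pool name is again an example):
ceph osd pool set rbd pg_num 512
ceph osd pool set rbd pgp_num 512   # older releases need this set explicitly; Nautilus adjusts it gradually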
Ste
125 Posts
March 11, 2020, 12:03 pm
Fixed! Now there's only this warning left; is it serious or not?
BlueFS spillover detected on 2 OSD(s)
Thanks. Ste
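For background (general Ceph behaviour, not specific to this setup): this warning means the RocksDB metadata of those two OSDs has outgrown the fast journal/DB partition and part of it now lives on the slow data disk. It hurts performance but is not data loss. Which OSDs are affected, and by how much, shows up in:
ceph health detail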
admin
2,930 Posts
March 11, 2020, 12:16 pm
Do the online upgrade, then delete the journal and OSDs, then re-add the OSDs.
To delete the OSDs:
systemctl stop ceph-osd@X
delete it from the UI
re-add it from the UI
If this is not a new cluster, you need to delete the OSDs and journal one node at a time, and do not touch the next node until the cluster health is OK.
If this is a new cluster, you can download the latest ISO and re-install.
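Put together, a minimal sketch of that per-OSD cycle (OSD id 2 is an example; the delete and re-add steps themselves happen in the PetaSAN UI):
systemctl stop ceph-osd@2   # stop the OSD daemon on its node
# delete osd.2 from the UI, then re-add it from the UI
ceph -s                     # wait for HEALTH_OK before moving to the next node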
Last edited on March 11, 2020, 12:25 pm by admin · #7