
Ceph Health: Too many PGs per OSD


Hi there,

We've installed a 5-node cluster (each node with 5 disks: 5 OSDs and 1 journal), and we selected the cluster size of 50-100 disks (perhaps this was the issue).
The installation completed successfully, but the following message appeared on the dashboard:

Reduced data availability: 1237 pgs inactive
too many PGs per OSD (777 > max 300)
Degraded data redundancy: 1237 pgs unclean
2 slow requests are blocked > 32 sec

Afterwards we added one more OSD disk to every node (6 in total now), and the issue with the unclean PGs was resolved.
The output now looks good, apart from one message:


ceph -w --cluster peta-001-bit1

  cluster:
    id:     e22c41d1-937a-4597-ba82-db706c0d9f53
    health: HEALTH_WARN
            too many PGs per OSD (491 > max 300)

  services:
    mon: 3 daemons, quorum cep-001-bit1,cep-002-bit1,cep-003-bit1
    mgr: cep-002-bit1(active), standbys: cep-003-bit1, cep-001-bit1
    osd: 25 osds: 25 up, 25 in

  data:
    pools:   1 pools, 4096 pgs
    objects: 0 objects, 0 bytes
    usage:   525 GB used, 6955 GB / 7481 GB avail
    pgs:     4096 active+clean


 

Does this have something to do with the cluster-size config we selected initially?

Many thanks in advance.

Best regards
Reto

 

Yes, it is related. The initial selection of 50->200 disks results in 4096 PGs; it would have been better to choose 15->50, which results in 1024 PGs.

Now you have 25 OSDs: each OSD has 4096 x 3 (replicas) / 25 = 491 PGs.

The warning appears because the upper limit is 300 PGs per OSD. Your cluster will work, but it puts too much stress on the OSDs, as each one needs to synchronize all these PGs with its peer OSDs.

The 15->50 disk selection would have resulted in 122 PGs per OSD, which would be an ideal count.
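
If you want to double-check the figures behind this calculation on the cluster itself, they can be read back with the commands below; a minimal sketch, assuming the pool is named rbd (substitute your actual pool name):

ceph osd pool get rbd pg_num --cluster peta-001-bit1     # pg_num of the pool (4096 in this case)
ceph osd pool get rbd size --cluster peta-001-bit1       # replica count of the pool (3 in this case)
ceph osd df --cluster peta-001-bit1                      # the PGS column shows how many PGs each OSD carries (~491 here)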

It is not possible to decrease the PG count. It is possible to increase it (if expanding the cluster), but that will generate a lot of rebalancing of stored data, so it is really better to get it correct from the beginning. The Ceph developers will try to make this parameter more flexible in the future, but currently you need to know it beforehand.
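
For reference, an increase after expanding the cluster would be done per pool roughly like this (POOL and XX are placeholders for your pool name and the new PG count); this is a sketch only, since it triggers the rebalance mentioned above:

ceph osd pool set POOL pg_num XX     # XX = the new, larger PG count (a power of two is the usual recommendation)
ceph osd pool set POOL pgp_num XX    # raise pgp_num to the same value so data actually remaps
ceph -w                              # watch the resulting backfill/rebalance until the cluster is active+clean again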

If this is a test cluster, I would recommend re-installing, or maybe increasing the disks to 42 OSDs so as to be just below the 300 PG warning.
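
The 42-OSD figure follows from the same formula; as a quick shell arithmetic check:

echo $(( 4096 * 3 / 42 ))    # prints 292, i.e. ~292 PGs per OSD, just under the 300 limit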

Hi admin,

Many thanks for the great, detailed explanation!
Alright, we are going to reinstall PetaSAN, as we do not plan to increase the number of disks in the near future.

Best regards and thanks again
Reto

Hey,

Got the same situation, but I have been running this for over a year. Just got the warning "too many PGs per OSD (357 > max 300)". I have 17 OSDs with 1024 PGs, now what? Is this going to hurt it?

Not too alarming. Some options:

1 - ignore the warning

2 - add approximately 20% more OSDs

3 - from the Ceph Configuration menu in the UI, increase mon_max_pg_per_osd under the mgr section from 300 to 360 (a CLI sketch follows after this list)

4 - decrease the PG count in your pools by 20%, which will cause a data rebalance (see also the note after this list):
ceph osd pool set POOL pg_num XX
ceph osd pool set POOL pgp_num XX
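
For option 3, the same setting can also be applied from the CLI instead of the UI; a rough sketch, assuming a Ceph release that has the ceph config command (Mimic or later):

ceph config set mgr mon_max_pg_per_osd 360     # set it under the mgr section, as in the UI
ceph config get mgr mon_max_pg_per_osd         # read it back to confirm the new value

For option 4, keep in mind that decreasing pg_num on an existing pool is only supported on Ceph Nautilus and later; on older releases the count can only be increased.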

Well, I changed the number to 360 yesterday, and this morning another line was there with 300 as the max, so now there are two lines for the same thing. One of them I'm unable to change; it just keeps coming back to 300.

mon_max_pg_per_osd = 300
mon_max_pg_per_osd = 360

I deleted the one with 300, which left the one with 360 listed. After refreshing, the two came back, so I am unable to make the change.

Can you delete both keys, then re-add the key under the mgr section? There was probably another global key left over from the upgrade.

It won't stay deleted 🙂

I removed both statements and created one under mgr, but it came back in global as 300. I guess we'll have to try #4 from your options above?

 

From the CLI, run:

ceph config rm global mon_max_pg_per_osd
ceph config-key rm config/global/mon_max_pg_per_osd

Then, from the UI, delete the key in both places, wait a few minutes, and check that it does not come back 🙂 then add the key from the UI under the mgr section.

I suspect this could be due to the upgrade of the conf file; it could be a bug in the ceph config assimilate-conf command we use.
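
To confirm the stale key is really gone before re-adding it, both stores can also be checked from the CLI; a small sketch using the same key name:

ceph config dump | grep mon_max_pg_per_osd        # centralized config database, should return nothing
ceph config-key ls | grep mon_max_pg_per_osd      # raw config-key store, should return nothing either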

Just a question on this: would this result in any outages? If the setting cannot be applied, will the system continue to run with it?
