Ceph Health: Too many PGs per OSD
reto
February 20, 2018, 7:03 pm
Hi there,
we've installed a 5-node cluster (each node with 5 disks: 5 OSDs and 1 journal) and selected the cluster size of 50-100 disks (perhaps this was the issue).
The installation went successfully, but the following messages appeared on the dashboard:
Reduced data availability: 1237 pgs inactive
too many PGs per OSD (777 > max 300)
Degraded data redundancy: 1237 pgs unclean
2 slow requests are blocked > 32 sec
Afterwards we added one more OSD disk to every node (6 per node now), and the issue with the unclean PGs was resolved.
The output now looks good apart from one message:
ceph osd -w --cluster peta-001-bit1
cluster:
id: e22c41d1-937a-4597-ba82-db706c0d9f53
health: HEALTH_WARN
too many PGs per OSD (491 > max 300)
services:
mon: 3 daemons, quorum cep-001-bit1,cep-002-bit1,cep-003-bit1
mgr: cep-002-bit1(active), standbys: cep-003-bit1, cep-001-bit1
osd: 25 osds: 25 up, 25 in
data:
pools: 1 pools, 4096 pgs
objects: 0 objects, 0 bytes
usage: 525 GB used, 6955 GB / 7481 GB avail
pgs: 4096 active+clean
Does this have something to do with the cluster-size setting we selected initially?
Many thanks in advance.
Best regards
Reto
admin
February 20, 2018, 7:55 pm
Yes, it is related. The initial selection of 50-200 disks results in 4096 PGs; it would have been better to choose 15-50, which results in 1024 PGs.
Now you have 25 OSDs: each OSD carries 4096 × 3 (replicas) / 25 ≈ 491 PGs.
The warning appears because the upper limit is 300 PGs per OSD. Your cluster will still work, but this puts extra stress on each OSD, since it has to synchronize all of these PGs with its peer OSDs.
The 15-50 disks selection would have resulted in about 122 PGs per OSD, which would be an ideal count.
It is not possible to decrease the PG count. It is possible to increase it (when expanding the cluster), but that generates a lot of rebalancing of stored data, so it is really better to get it right from the beginning. The Ceph developers plan to make this parameter more flexible in the future, but for now you need to know it beforehand.
If this is a test cluster, I would recommend a re-install, or alternatively increase the disk count to 42 OSDs, which would put you just below the 300 PG warning (4096 × 3 / 42 ≈ 293).
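For reference, these figures can be cross-checked on a live cluster. A minimal sketch, assuming a single replicated pool; POOL is a placeholder for your pool name, and the --cluster flag is only needed because this cluster uses a non-default name:

# PGs currently mapped to each OSD (see the PGS column)
ceph osd df --cluster peta-001-bit1

# Inputs to the estimate: PGs per OSD ≈ pg_num × replica size / number of OSDs
ceph osd pool get POOL pg_num --cluster peta-001-bit1    # 4096 in this cluster
ceph osd pool get POOL size --cluster peta-001-bit1      # 3 replicas
ceph osd stat --cluster peta-001-bit1                    # 25 osds: 25 up, 25 in
# => 4096 x 3 / 25 ≈ 491 PGs per OSD, matching the warning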
reto
February 20, 2018, 8:55 pm
Hi admin,
Many thanks for the great, detailed explanation!
Alright, we are going to reinstall PetaSAN, as we do not plan to increase the disk count in the near future.
Best regards and thanks again
Reto
khopkins
October 8, 2020, 6:41 pm
Hey,
We've got the same situation, but we have been running this cluster for over a year. We just got the warning "too many PGs per OSD (357 > max 300)". We have 17 OSDs with 1024 PGs. Now what? Is this going to hurt anything?
admin
October 8, 2020, 9:14 pm
Not too alarming. Some options:
1 - Ignore the warning.
2 - Add approximately 20% more OSDs.
3 - From the Ceph Configuration menu in the UI, increase mon_max_pg_per_osd under the mgr section from 300 to 360.
4 - Decrease the PG count in your pools by about 20%; note this will cause a data rebalance (see the example after the commands below):
ceph osd pool set POOL pg_num XX
ceph osd pool set POOL pgp_num XX
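For option 4, it helps to check the current values first. Note that decreasing pg_num is only supported on Ceph Nautilus (14.x) and newer; older releases only allow increases. A minimal sketch, with POOL as a placeholder for your pool name and 512 chosen purely as an illustration (PG counts are conventionally kept at powers of two):

# Current values
ceph osd pool get POOL pg_num
ceph osd pool get POOL pgp_num

# Example reduction; expect data movement while PGs are merged
ceph osd pool set POOL pg_num 512
ceph osd pool set POOL pgp_num 512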
khopkins
October 9, 2020, 12:49 pm
Well, I changed the number to 360 yesterday, and this morning another line was there with a max of 300. So now there are two lines for the same setting, but one of them I'm unable to change; it just keeps coming back to 300.
mon_max_pg_per_osd = 300
mon_max_pg_per_osd = 360
I deleted the one with 300, which left the one with 360 listed. After a refresh, both came back, so I'm unable to make the change stick.
admin
October 9, 2020, 1:39 pm
Can you delete both keys, then re-add the key under the mgr section? There was probably another global key left over from the upgrade.
khopkins
October 9, 2020, 1:51 pm
It won't stay deleted 🙂
I removed both entries and created one under mgr, but it came back in global as 300. I guess we'll have to try option #4 from your list above?
admin
October 9, 2020, 2:15 pm
From the CLI, run:
ceph config rm global mon_max_pg_per_osd
ceph config-key rm config/global/mon_max_pg_per_osd
Then, from the UI, delete the key in both places, wait a few minutes and check that it does not come back 🙂, then add the key from the UI in the mgr section.
I suspect this could be due to the upgrade of the conf file; it could be a bug in the ceph config assimilate-conf command we use.
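A few extra checks can confirm the cleanup before re-adding the key. This is a minimal sketch; it assumes the UI's mgr section maps to the mgr target in Ceph's centralized config database:

# Check that the stray entry is gone from both the config database and the key store
ceph config dump | grep mon_max_pg_per_osd
ceph config-key ls | grep mon_max_pg_per_osd

# Equivalent CLI way to re-add the override for the mgr daemons
ceph config set mgr mon_max_pg_per_osd 360

# Re-run the dump afterwards and verify that only the single mgr-scoped entry remains
ceph config dump | grep mon_max_pg_per_osd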
khopkins
October 9, 2020, 2:23 pm
Just a question on this: would this result in any outages? If the setting cannot be applied, will the system continue to run as it is?