Pool has low available Space
Mattis
7 Posts
July 1, 2024, 9:19 am
We have a PetaSAN cluster with 14 TB available, but our pools only have 50 GB of available space.
We have an NFS share and some iSCSI disks on our SAN.
admin
2,930 Posts
July 1, 2024, 11:20 am
This could happen if you have an imbalance among your OSDs; a pool's available size is limited by its most-filled OSD. Look at the dashboard chart for the top OSD fill %; running the command
ceph osd df
will also show you the variation among OSDs. You should enable the Ceph balancer from the UI; the PG autoscaler will also help make sure you have enough PGs for balancing. As a short-term fix you can also change the OSD crush weights, also from the UI, to get a better balance.
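For reference, a minimal sketch of the CLI side of the same steps, assuming a recent Ceph release (the pool name is only a placeholder; PetaSAN exposes the same switches in the UI):
ceph balancer status                                  # show whether the balancer is on and which mode it uses
ceph balancer on                                      # enable automatic balancing
ceph osd pool set <pool-name> pg_autoscale_mode on    # enable the pg autoscaler per pool
ceph osd df                                           # the %USE and VAR columns show the per-OSD imbalance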
Mattis
7 Posts
July 1, 2024, 1:49 pm
The balancer is active, but I have one OSD at 95% and one at 80% on the same node; both are SSDs with the same capacity.
admin
2,930 Posts
July 1, 2024, 2:17 pm
Could you post
ceph osd df
ceph df
For a temporary quick fix, reduce the crush weight of the OSD by 10% from the UI.
Is the pg autoscaler on, and is it enabled on the pools?
Last edited on July 1, 2024, 2:17 pm by admin · #4
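A rough CLI sketch of the same quick fix, where osd.1 is purely an example name and the new weight is assumed to be about 10% below the 6.98630 default weight of a 7 TiB OSD:
ceph osd crush reweight osd.1 6.29    # lower the crush weight of the most-filled OSD by roughly 10%
ceph osd pool autoscale-status        # shows pg_autoscale_mode per pool, which answers the autoscaler question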
Mattis
7 Posts
July 1, 2024, 2:55 pm
root@sanc1:~# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
5 ssd 6.98630 1.00000 7.0 TiB 5.9 TiB 5.8 TiB 23 GiB 56 GiB 1.1 TiB 83.98 0.96 177 up
6 ssd 6.98630 1.00000 7.0 TiB 5.6 TiB 5.5 TiB 24 GiB 56 GiB 1.4 TiB 79.95 0.92 177 up
7 ssd 6.98630 1.00000 7.0 TiB 6.4 TiB 6.3 TiB 23 GiB 54 GiB 590 GiB 91.75 1.05 178 up
8 ssd 6.98630 1.00000 7.0 TiB 6.4 TiB 6.3 TiB 23 GiB 57 GiB 610 GiB 91.48 1.05 180 up
9 ssd 6.98630 1.00000 7.0 TiB 6.3 TiB 6.2 TiB 25 GiB 55 GiB 747 GiB 89.56 1.03 185 up
10 ssd 6.98630 1.00000 7.0 TiB 5.7 TiB 5.6 TiB 24 GiB 58 GiB 1.3 TiB 81.83 0.94 177 up
11 ssd 6.98630 1.00000 7.0 TiB 5.8 TiB 5.7 TiB 23 GiB 58 GiB 1.2 TiB 82.41 0.94 178 up
12 ssd 6.98630 1.00000 7.0 TiB 6.3 TiB 6.2 TiB 24 GiB 59 GiB 735 GiB 89.73 1.03 189 up
13 ssd 6.98630 1.00000 7.0 TiB 6.6 TiB 6.5 TiB 22 GiB 54 GiB 431 GiB 93.98 1.08 175 up
14 ssd 6.98630 1.00000 7.0 TiB 6.2 TiB 6.1 TiB 24 GiB 54 GiB 800 GiB 88.81 1.02 178 up
0 ssd 6.98630 1.00000 7.0 TiB 6.4 TiB 6.3 TiB 23 GiB 58 GiB 616 GiB 91.39 1.05 185 up
1 ssd 6.98630 1.00000 7.0 TiB 6.6 TiB 6.5 TiB 24 GiB 57 GiB 388 GiB 94.58 1.08 182 up
2 ssd 6.98630 1.00000 7.0 TiB 5.2 TiB 5.1 TiB 25 GiB 53 GiB 1.8 TiB 74.56 0.85 175 up
3 ssd 6.98630 1.00000 7.0 TiB 6.3 TiB 6.2 TiB 23 GiB 58 GiB 729 GiB 89.81 1.03 181 up
4 ssd 6.98630 1.00000 7.0 TiB 6.0 TiB 6.0 TiB 22 GiB 57 GiB 973 GiB 86.40 0.99 174 up
TOTAL 105 TiB 92 TiB 90 TiB 352 GiB 843 GiB 13 TiB 87.35
MIN/MAX VAR: 0.85/1.08 STDDEV: 5.50
root@sanc1:~# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
ssd 105 TiB 13 TiB 92 TiB 92 TiB 87.35
TOTAL 105 TiB 13 TiB 92 TiB 92 TiB 87.35
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
.mgr 1 1 17 MiB 6 51 MiB 0.01 150 GiB
rbd 2 128 21 TiB 5.47M 62 TiB 99.30 150 GiB
nfs-pool 3 256 9.0 TiB 192.48M 28 TiB 98.48 150 GiB
nfs-metadata 4 512 118 GiB 5.53M 354 GiB 44.07 150 GiB
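A back-of-the-envelope reading of why MAX AVAIL is so small, assuming 3x replication and the default 95% full ratio: the most-filled OSD, osd.1, is at 94.58%, so it can absorb only about (0.95 - 0.9458) x 7.0 TiB ≈ 30 GiB before reaching the full ratio. New writes spread roughly evenly across the 15 OSDs, and each GiB stored in a pool consumes about 3/15 = 1/5 GiB on every OSD, so the pools can grow by only about 30 x 5 ≈ 150 GiB, which matches the MAX AVAIL column above.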
admin
2,930 Posts
July 1, 2024, 3:34 pm
The number of PGs per OSD is good, and the variation in PG count among the OSDs is not bad, but even OSDs with the same PG count show a usage variance, probably due to random size variation among the PGs. There is no need to change the pg autoscaler.
The quickest fix now is to manually lower the crush weight of the top-used OSDs from the UI by, say, 7%.
You can also experiment with the different balancer modes: crush vs upmap.
It is possible the balancer is not efficient if some OSDs fall under the default crush rule as well as an ssd class rule. In that case it may make sense to use only the ssd class rule, but this needs to be investigated first, as doing it directly will cause rebalance traffic; search for how to do this efficiently.
In general your usage is over 85%, so it could make sense to add more storage, especially if you are writing new data.
Lastly, we have colored usage warnings on the dashboard page; it is important to check these regularly and not leave it until the disks get filled. You can also configure the system to send notification emails.
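For reference, a minimal sketch of switching balancer modes from the CLI; upmap usually balances better than crush-compat but requires all clients to be Luminous or newer:
ceph osd set-require-min-compat-client luminous   # upmap mode needs luminous+ clients
ceph balancer mode upmap                          # or: ceph balancer mode crush-compat
ceph balancer eval                                # score the current distribution; lower is better
ceph balancer status                              # confirm the mode and that the balancer is active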