OSD balancer and pool size issue
killerodin
33 Posts
July 12, 2022, 11:39 am
Hello everyone
I have two problems; I hope it's OK that I put them together. Maybe they're related to each other.
First, an overview of everything:
We're running version 3.0.1 and have 3 nodes with 11 SSD OSDs per node and 4 nodes with 2 HDD OSDs per node.
Then we have three pools:
- device_health_metrics - this pool was created automatically
- SSD-Pool, 900 PGs (we started with 9 OSDs)
- HDD-Pool, 1024 PGs
Output of ceph df detail
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 131 TiB 71 TiB 61 TiB 61 TiB 46.05
ssd 115 TiB 34 TiB 81 TiB 81 TiB 70.22
TOTAL 247 TiB 105 TiB 141 TiB 141 TiB 57.34
--- POOLS ---
POOL ID PGS STORED (DATA) (OMAP) OBJECTS USED (DATA) (OMAP) %USED MAX AVAIL QUOTA OBJECTS QUOTA BYTES DIRTY USED COMPR UNDER COMPR
SSD-Pool 7 900 27 TiB 27 TiB 7.8 MiB 8.23M 81 TiB 81 TiB 23 MiB 89.29 3.2 TiB N/A N/A 8.23M 0 B 0 B
HDD-Pool 8 1024 20 TiB 20 TiB 2.8 MiB 5.27M 60 TiB 60 TiB 8.4 MiB 49.13 21 TiB N/A N/A 5.27M 0 B 0 B
device_health_metrics 9 1 24 MiB 0 B 24 MiB 46 72 MiB 0 B 72 MiB 0 6.9 TiB N/A N/A 46 0 B 0 B
In my opinion the SSD pool should have a size of 125 TiB (33 x 3.8 TiB), but after the last expansion, when we added 3 OSDs to the nodes, the pool didn't resize automatically (until now it always did).
We also have the balancer active.
Output of ceph balancer status
{
"active": true,
"last_optimize_duration": "0:00:00.036000",
"last_optimize_started": "Tue Jul 12 13:20:47 2022",
"mode": "crush-compat",
"optimize_result": "Some osds belong to multiple subtrees: {0: ['SSD-Pool', 'default'], 1: ['SSD-Pool', 'default'], 2: ['SSD-Pool', 'default'], 3: ['SSD-Pool', 'default'], 4: ['SSD-Pool', 'default'], 5: ['SSD-Pool', 'default'], 6: ['SSD-Pool', 'default'], 7: ['SSD-Pool', 'default'], 8: ['SSD-Pool', 'default'], 9: ['SSD-Pool', 'default'], 10: ['SSD-Pool', 'default'], 11: ['SSD-Pool', 'default'], 12: ['SSD-Pool', 'default'], 13: ['SSD-Pool', 'default'], 14: ['SSD-Pool', 'default'], 15: ['HDD-Pool', 'default'], 16: ['HDD-Pool', 'default'], 17: ['HDD-Pool', 'default'], 18: ['HDD-Pool', 'default'], 19: ['HDD-Pool', 'default'], 20: ['HDD-Pool', 'default'], 21: ['HDD-Pool', 'default'], 22: ['HDD-Pool', 'default'], 23: ['SSD-Pool', 'default'], 24: ['SSD-Pool', 'default'], 25: ['SSD-Pool', 'default'], 26: ['SSD-Pool', 'default'], 27: ['SSD-Pool', 'default'], 28: ['SSD-Pool', 'default'], 29: ['SSD-Pool', 'default'], 30: ['SSD-Pool', 'default'], 31: ['SSD-Pool', 'default'], 32: ['SSD-Pool', 'default'], 33: ['SSD-Pool', 'default'], 34: ['SSD-Pool', 'default'], 35: ['SSD-Pool', 'default'], 36: ['SSD-Pool', 'default'], 37: ['SSD-Pool', 'default'], 38: ['SSD-Pool', 'default'], 39: ['SSD-Pool', 'default'], 40: ['SSD-Pool', 'default']}",
"plans": []
}
But the output of ceph osd df shows an imbalance and I don't know why:
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
15 hdd 16.42969 1.00000 16 TiB 7.2 TiB 7.2 TiB 3.9 MiB 11 GiB 9.2 TiB 44.02 0.77 367 up
16 hdd 16.42969 1.00000 16 TiB 7.7 TiB 7.6 TiB 46 KiB 12 GiB 8.7 TiB 46.87 0.82 391 up
17 hdd 16.42969 1.00000 16 TiB 7.8 TiB 7.8 TiB 3.8 MiB 12 GiB 8.6 TiB 47.69 0.83 398 up
18 hdd 16.42969 1.00000 16 TiB 7.7 TiB 7.7 TiB 459 KiB 11 GiB 8.7 TiB 47.13 0.82 393 up
19 hdd 16.42969 1.00000 16 TiB 7.7 TiB 7.6 TiB 166 KiB 11 GiB 8.7 TiB 46.85 0.82 391 up
20 hdd 16.42969 1.00000 16 TiB 7.4 TiB 7.3 TiB 212 KiB 11 GiB 9.1 TiB 44.75 0.78 373 up
21 hdd 16.42969 1.00000 16 TiB 7.6 TiB 7.5 TiB 133 KiB 11 GiB 8.8 TiB 46.21 0.81 385 up
22 hdd 16.42969 1.00000 16 TiB 7.4 TiB 7.3 TiB 23 MiB 11 GiB 9.1 TiB 44.86 0.78 375 up
0 ssd 3.49300 1.00000 3.5 TiB 2.3 TiB 2.3 TiB 4.9 MiB 6.9 GiB 1.2 TiB 64.97 1.13 77 up
1 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 25 MiB 6.8 GiB 1.0 TiB 70.07 1.22 84 up
2 ssd 3.49300 1.00000 3.5 TiB 2.2 TiB 2.2 TiB 342 KiB 6.7 GiB 1.3 TiB 64.09 1.12 78 up
9 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 3.8 MiB 6.8 GiB 1.1 TiB 69.40 1.21 81 up
12 ssd 3.49300 1.00000 3.5 TiB 2.7 TiB 2.7 TiB 0 B 7.0 GiB 801 GiB 77.62 1.35 84 up
23 ssd 3.49309 1.00000 3.5 TiB 2.2 TiB 2.2 TiB 4.6 MiB 5.8 GiB 1.3 TiB 64.09 1.12 73 up
26 ssd 3.49309 1.00000 3.5 TiB 3.0 TiB 3.0 TiB 84 KiB 5.5 GiB 479 GiB 86.60 1.51 105 up
28 ssd 3.49199 1.00000 3.5 TiB 2.2 TiB 2.2 TiB 3.3 MiB 5.8 GiB 1.3 TiB 64.07 1.12 74 up
29 ssd 3.49199 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 4.6 MiB 6.4 GiB 1.1 TiB 67.95 1.18 80 up
30 ssd 3.49199 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 1.2 MiB 6.1 GiB 1.1 TiB 69.47 1.21 83 up
37 ssd 3.49199 1.00000 3.5 TiB 2.6 TiB 2.6 TiB 2.6 MiB 6.5 GiB 926 GiB 74.11 1.29 82 up
3 ssd 3.49300 1.00000 3.5 TiB 2.6 TiB 2.6 TiB 0 B 7.1 GiB 959 GiB 73.20 1.28 82 up
4 ssd 3.49300 1.00000 3.5 TiB 2.3 TiB 2.3 TiB 3.8 MiB 6.9 GiB 1.1 TiB 67.09 1.17 79 up
5 ssd 3.49300 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 25 MiB 7.7 GiB 987 GiB 72.40 1.26 85 up
10 ssd 3.49300 1.00000 3.5 TiB 2.7 TiB 2.7 TiB 0 B 7.1 GiB 854 GiB 76.12 1.33 88 up
13 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 5.6 MiB 6.6 GiB 1.1 TiB 69.29 1.21 82 up
24 ssd 3.49309 1.00000 3.5 TiB 2.2 TiB 2.2 TiB 4.9 MiB 5.5 GiB 1.3 TiB 62.71 1.09 74 up
27 ssd 3.49309 1.00000 3.5 TiB 2.3 TiB 2.3 TiB 4.8 MiB 4.2 GiB 1.2 TiB 64.88 1.13 77 up
31 ssd 3.49199 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 1.5 MiB 6.3 GiB 1.0 TiB 70.14 1.22 83 up
32 ssd 3.49199 1.00000 3.5 TiB 2.6 TiB 2.5 TiB 2.9 MiB 6.6 GiB 959 GiB 73.17 1.28 84 up
33 ssd 3.49199 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 1.6 MiB 6.4 GiB 1.0 TiB 70.97 1.24 84 up
38 ssd 3.49199 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 4.7 MiB 6.4 GiB 983 GiB 72.50 1.26 83 up
6 ssd 3.49300 1.00000 3.5 TiB 2.7 TiB 2.7 TiB 1.2 MiB 7.1 GiB 855 GiB 76.11 1.33 88 up
7 ssd 3.49300 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 4.9 MiB 7.6 GiB 1.0 TiB 71.08 1.24 80 up
8 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 0 B 6.8 GiB 1.1 TiB 69.38 1.21 81 up
11 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 0 B 6.5 GiB 1.1 TiB 67.79 1.18 80 up
14 ssd 3.49300 1.00000 3.5 TiB 2.3 TiB 2.3 TiB 2.2 MiB 6.7 GiB 1.2 TiB 64.90 1.13 73 up
25 ssd 3.49309 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 2.6 MiB 5.6 GiB 1.0 TiB 70.92 1.24 86 up
34 ssd 3.49199 1.00000 3.5 TiB 2.2 TiB 2.2 TiB 4.2 MiB 6.1 GiB 1.3 TiB 63.28 1.10 78 up
35 ssd 3.49199 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 684 KiB 6.1 GiB 1016 GiB 71.59 1.25 89 up
36 ssd 3.49199 1.00000 3.5 TiB 2.3 TiB 2.3 TiB 3.2 MiB 6.1 GiB 1.1 TiB 67.16 1.17 78 up
39 ssd 3.49199 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 0 B 6.6 GiB 1.0 TiB 70.96 1.24 78 up
40 ssd 3.49309 1.00000 3.5 TiB 2.8 TiB 2.8 TiB 2.1 MiB 5.0 GiB 742 GiB 79.25 1.38 89 up
TOTAL 247 TiB 141 TiB 141 TiB 158 MiB 302 GiB 105 TiB 57.34
What can I do so that the balancer works better, and why doesn't my pool 'SSD-Pool' resize automatically?
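A rough back-of-the-envelope on the 3.2 TiB MAX AVAIL (a sketch only, assuming the SSD-Pool is 3-replica - 27 TiB STORED vs 81 TiB USED suggests it is - and the default full_ratio of 0.95; Ceph projects MAX AVAIL from the fullest OSD, not from total free space):
0.95 - 0.866 (fullest SSD OSD, osd.26) = 0.084 headroom
0.084 x 115 TiB raw ≈ 9.7 TiB raw
9.7 TiB / 3 replicas ≈ 3.2 TiB
So the low MAX AVAIL is consistent with the imbalance rather than with missing capacity.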
admin
2,930 Posts
July 12, 2022, 10:17 pm
Can you try changing the balancer mode to upmap and see if it solves the balancing issue?
Yes, the remaining capacity is affected by the balancing not working correctly.
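For reference, a minimal sketch of the CLI equivalent (standard Ceph commands, run from any node with admin access):
# check the current mode and the result of the last optimization run
ceph balancer status
# switch the balancer to upmap mode
ceph balancer mode upmap
# re-check after a few minutes
ceph balancer status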
killerodin
33 Posts
July 13, 2022, 6:45 am
I tried it via the web UI but got an error:
Error, min_compat_client "luminous" is required for pg-upmap.
Can I change this on the production system during operation? And how can I change it? I'm never sure what I'm allowed to change and what has consequences.
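For reference, a minimal sketch of how the setting named in the error can be inspected and raised from the CLI (an illustration only; raise it only once you are sure no pre-luminous clients connect):
# show the currently required minimum client release
ceph osd dump | grep min_compat_client
# allow pg-upmap by requiring at least luminous clients
ceph osd set-require-min-compat-client luminous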
killerodin
33 Posts
July 13, 2022, 9:45 am
Edit:
Output of ceph versions is
{
"mon": {
"ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 3
},
"mgr": {
"ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 3
},
"osd": {
"ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 41
},
"mds": {
"ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 3
},
"overall": {
"ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 50
}
}
admin
2,930 Posts
July 13, 2022, 10:05 am
The balancer will not allow you to use pg-upmap if you have pre-luminous clients.
Run
ceph features
to see if you have clients which require pre-luminous features.
Another way is to keep the balancer as it is and delete the default replicated rule. Make sure your pools do not use this rule; you can delete the device health metrics pool if it is the one using it and, if you wish, recreate it with the SSD rule.
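A minimal sketch of how that can be checked and done from the CLI (the placeholder <ssd_rule_name> stands for whatever rule name 'ceph osd crush rule ls' reports for the SSD pool):
# list CRUSH rules and see which rule the device health pool uses
ceph osd crush rule ls
ceph osd pool get device_health_metrics crush_rule
# move the device health pool off the default rule
ceph osd pool set device_health_metrics crush_rule <ssd_rule_name>
# once no pool references it, the default rule can be removed
ceph osd crush rule rm replicated_rule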
killerodin
33 Posts
July 13, 2022, 12:03 pm
ceph features shows
"mon":
"features": "0x3f01cfb8ffedffff",
"release": "luminous",
"num": 3
"mds":
"features": "0x3f01cfb8ffedffff",
"release": "luminous",
"num": 3
"osd":
"features": "0x3f01cfb8ffedffff",
"release": "luminous",
"num": 41
"client":
"features": "0x2f018fb86aa42ada",
"release": "luminous",
"num": 3
"features": "0x3f01cfb8ffedffff",
"release": "luminous",
"num": 3
"mgr":
"features": "0x3f01cfb8ffedffff",
"release": "luminous",
"num": 3
So is the highlighted client (the one with features 0x2f018fb86aa42ada) the problem?
What exactly is the device_health_metrics pool for? Do we need it, and what happens if I delete it? Only this pool uses the default replicated_rule.
Sorry for asking so many questions, but I prefer to be on the safe side and don't want to do things I don't understand.
admin
2,930 Posts
July 13, 2022, 1:27 pm
The client is OK, it is requesting luminous features, but I am not sure why you are getting the pre-luminous error when trying to set upmap.
You can change the device health pool to use the SSD rule from the UI; you can then delete the default rule:
ceph osd crush rule rm replicated_rule
killerodin
33 Posts
July 13, 2022, 2:02 pm
Hmm... strange.
I've changed the rule for the device_health_metrics pool and deleted the default rule. But I still can't change the balancer mode.
admin
2,930 Posts
July 13, 2022, 6:37 pm
You should not have to change the mode; deleting the default rule should make the balancer work with the existing mode. Check that the balancer status no longer shows "Some osds belong to multiple subtrees".
killerodin
33 Posts
July 14, 2022, 7:23 am
Hey
ceph balancer status says
{
"active": true,
"last_optimize_duration": "0:00:01.502575",
"last_optimize_started": "Thu Jul 14 08:12:52 2022",
"mode": "crush-compat",
"optimize_result": "Unable to find further optimization, change balancer mode and retry might help",
"plans": []
}
and ceph osd df looks much better, thanks a lot
root@KXCHMUEST020:~# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
15 hdd 16.42969 1.00000 16 TiB 7.6 TiB 7.5 TiB 978 KiB 11 GiB 8.8 TiB 46.29 0.80 376 up
16 hdd 16.42969 1.00000 16 TiB 7.8 TiB 7.8 TiB 2.3 MiB 12 GiB 8.6 TiB 47.75 0.82 388 up
17 hdd 16.42969 1.00000 16 TiB 7.9 TiB 7.8 TiB 3.9 MiB 12 GiB 8.6 TiB 47.86 0.83 389 up
18 hdd 16.42969 1.00000 16 TiB 7.8 TiB 7.8 TiB 426 KiB 12 GiB 8.6 TiB 47.66 0.82 387 up
19 hdd 16.42969 1.00000 16 TiB 7.8 TiB 7.7 TiB 3.0 MiB 12 GiB 8.6 TiB 47.49 0.82 386 up
20 hdd 16.42969 1.00000 16 TiB 7.7 TiB 7.7 TiB 3.8 MiB 12 GiB 8.7 TiB 47.06 0.81 382 up
21 hdd 16.42969 1.00000 16 TiB 7.7 TiB 7.7 TiB 1.2 MiB 12 GiB 8.7 TiB 47.07 0.81 382 up
22 hdd 16.42969 1.00000 16 TiB 7.7 TiB 7.7 TiB 1.9 MiB 12 GiB 8.7 TiB 47.10 0.81 382 up
0 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 4.9 MiB 7.0 GiB 1.1 TiB 68.02 1.17 82 up
1 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 26 MiB 6.9 GiB 1.1 TiB 68.58 1.18 82 up
2 ssd 3.49300 1.00000 3.5 TiB 2.3 TiB 2.3 TiB 342 KiB 6.9 GiB 1.1 TiB 67.09 1.16 82 up
9 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 980 KiB 6.9 GiB 1.1 TiB 69.38 1.20 82 up
12 ssd 3.49300 1.00000 3.5 TiB 2.6 TiB 2.6 TiB 0 B 7.1 GiB 881 GiB 75.38 1.30 82 up
23 ssd 3.49309 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 4.6 MiB 6.0 GiB 988 GiB 72.38 1.25 82 up
26 ssd 3.49309 1.00000 3.5 TiB 2.3 TiB 2.3 TiB 5.4 MiB 5.4 GiB 1.2 TiB 67.07 1.16 82 up
28 ssd 3.49199 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 37 KiB 6.4 GiB 987 GiB 72.41 1.25 82 up
29 ssd 3.49199 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 4.6 MiB 6.2 GiB 1.1 TiB 69.45 1.20 82 up
30 ssd 3.49199 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 1.2 MiB 6.6 GiB 1.1 TiB 69.46 1.20 82 up
37 ssd 3.49199 1.00000 3.5 TiB 2.6 TiB 2.6 TiB 4.1 MiB 6.4 GiB 952 GiB 73.38 1.26 81 up
3 ssd 3.49300 1.00000 3.5 TiB 2.6 TiB 2.6 TiB 0 B 7.1 GiB 957 GiB 73.24 1.26 81 up
4 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 3.8 MiB 7.0 GiB 1.1 TiB 69.34 1.20 82 up
5 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 25 MiB 6.9 GiB 1.0 TiB 70.12 1.21 82 up
10 ssd 3.49300 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 0 B 6.9 GiB 1016 GiB 71.58 1.23 82 up
13 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 1.2 MiB 6.8 GiB 1.1 TiB 68.57 1.18 81 up
24 ssd 3.49309 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 4.9 MiB 6.1 GiB 1.1 TiB 69.50 1.20 82 up
27 ssd 3.49309 1.00000 3.5 TiB 2.5 TiB 2.4 TiB 4.6 MiB 4.7 GiB 1.0 TiB 70.15 1.21 83 up
31 ssd 3.49199 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 1.5 MiB 6.6 GiB 1.1 TiB 69.37 1.20 82 up
32 ssd 3.49199 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 2.9 MiB 6.5 GiB 1.0 TiB 70.95 1.22 82 up
33 ssd 3.49199 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 25 MiB 6.3 GiB 1.1 TiB 68.73 1.18 82 up
38 ssd 3.49199 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 4.6 MiB 6.5 GiB 1.0 TiB 71.03 1.22 82 up
6 ssd 3.49300 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 1.2 MiB 7.1 GiB 1.0 TiB 70.83 1.22 82 up
7 ssd 3.49300 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 29 MiB 7.5 GiB 1.0 TiB 71.12 1.23 82 up
8 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 0 B 6.7 GiB 1.1 TiB 68.62 1.18 81 up
11 ssd 3.49300 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 0 B 6.6 GiB 1.1 TiB 69.33 1.20 82 up
14 ssd 3.49300 1.00000 3.5 TiB 2.5 TiB 2.5 TiB 2.2 MiB 7.1 GiB 986 GiB 72.45 1.25 82 up
25 ssd 3.49309 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 2.0 MiB 6.1 GiB 1.1 TiB 67.93 1.17 82 up
34 ssd 3.49199 1.00000 3.5 TiB 2.4 TiB 2.4 TiB 4.1 MiB 6.2 GiB 1.1 TiB 67.82 1.17 82 up
35 ssd 3.49199 1.00000 3.5 TiB 2.3 TiB 2.3 TiB 798 KiB 6.0 GiB 1.2 TiB 65.57 1.13 82 up
36 ssd 3.49199 1.00000 3.5 TiB 2.5 TiB 2.4 TiB 3.1 MiB 6.2 GiB 1.0 TiB 70.19 1.21 82 up
39 ssd 3.49199 1.00000 3.5 TiB 2.6 TiB 2.6 TiB 0 B 6.7 GiB 903 GiB 74.76 1.29 82 up
40 ssd 3.49309 1.00000 3.5 TiB 2.6 TiB 2.6 TiB 2.1 MiB 5.0 GiB 932 GiB 73.95 1.27 82 up
TOTAL 247 TiB 143 TiB 142 TiB 188 MiB 308 GiB 104 TiB 58.01
MIN/MAX VAR: 0.80/1.30 STDDEV: 12.12
So the problem was that two different rules were applied to the OSDs, right?
Can I ask you another question?
We also have problems with PGs not being deep-scrubbed, and you said here (http://www.petasan.org/forums/?view=thread&id=702) that it is possible to set some options in the Ceph configuration (osd_scrub_begin_hour, osd_scrub_end_hour, osd_scrub_sleep, osd_scrub_load_threshold). If I change these settings, there will be interruptions in the production system, right?
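For reference, a minimal sketch of setting those options through the centralized config database (the values are examples only; suitable values depend on the workload):
# limit scrubbing to a nightly window, e.g. 20:00-06:00
ceph config set osd osd_scrub_begin_hour 20
ceph config set osd osd_scrub_end_hour 6
# throttle scrub I/O and skip scrubs while the OSD host is busy
ceph config set osd osd_scrub_sleep 0.1
ceph config set osd osd_scrub_load_threshold 0.5
# verify a value
ceph config get osd osd_scrub_begin_hour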