mix different sizes of SSD
exitsys
43 Posts
October 10, 2020, 6:55 am
The CPU and memory utilization are at about 0%.
Network utilization is at 2%, with throughput at 20 MB/s.
Everything is idle. Nevertheless, the overnight rebuild is still showing degraded data redundancy: 2267002/11139717 objects degraded (20.351%), 323 pgs degraded, 323 pgs undersized.
Is there anything wrong with this? I had 6 x 960 GB drives in it and exchanged them for 4 x 1920 GB.
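For readers following along: the degraded counts quoted above come straight from Ceph's PG map, and the same numbers can be polled outside the PetaSAN UI. Below is a minimal sketch that parses `ceph status --format json`; it assumes the `ceph` CLI and a readable keyring are available on the node, and the pgmap field names may vary slightly between Ceph releases.

```python
#!/usr/bin/env python3
"""Poll Ceph recovery progress by parsing `ceph status --format json`.

Assumptions: the `ceph` CLI is installed and the local keyring allows
read access; the pgmap field names below can differ between releases.
"""
import json
import subprocess


def recovery_progress():
    raw = subprocess.check_output(["ceph", "status", "--format", "json"])
    pgmap = json.loads(raw).get("pgmap", {})

    degraded = pgmap.get("degraded_objects", 0)
    total = pgmap.get("degraded_total", 0) or 1
    ratio = pgmap.get("degraded_ratio", degraded / total)

    # Recovery throughput, if Ceph is reporting it at the moment of the call.
    rate = pgmap.get("recovering_bytes_per_sec", 0)

    print(f"degraded: {degraded}/{total} objects ({ratio:.3%})")
    print(f"recovery rate: {rate / 1e6:.1f} MB/s")


if __name__ == "__main__":
    recovery_progress()
```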
admin
2,930 Posts
October 10, 2020, 8:52 am
It could well be that my load guess was wrong; it depends entirely on the environment. You have two issues:
Your current recovery/backfill speed is slow and your cluster is not loaded: set the backfill speed to "average" in the UI, observe the load charts as you did earlier, and include the disk % utilization/busy on all nodes. If all is OK after 1-2 hours, you can switch it to "fast" and then maybe "very fast" later. You can look at the PG Status chart for an estimate of when things will complete. (A sketch of the kind of Ceph options such a preset maps to follows below.)
Your earlier issue of iSCSI dropping: it could be something other than load, but that is hard for me to guess. I would look at the same load charts at the time of the iSCSI issue: was the load also low? Look at the PG Status charts: were any PGs inactive or down?
Last edited on October 10, 2020, 8:53 am by admin · #12
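The "backfill speed" presets in the PetaSAN UI are cluster-wide throttles on Ceph's recovery traffic. The exact values each preset applies are not shown in this thread, so the mapping below is an assumption for illustration only; what it does show is the kind of standard Ceph OSD options such a preset typically adjusts (`osd_max_backfills`, `osd_recovery_max_active`, `osd_recovery_sleep`), driven through the normal `ceph config set` interface.

```python
#!/usr/bin/env python3
"""Sketch: adjust Ceph recovery/backfill throttles.

The preset table is hypothetical; the real "slow/average/fast/very fast"
values are managed by the PetaSAN UI and may be overridden by it. This only
illustrates where the knobs live.
"""
import subprocess

# Hypothetical presets: higher concurrency and shorter sleep = faster
# recovery, at the cost of more load on the OSDs and the backend network.
PRESETS = {
    "slow":    {"osd_max_backfills": 1, "osd_recovery_max_active": 1, "osd_recovery_sleep": 0.5},
    "average": {"osd_max_backfills": 2, "osd_recovery_max_active": 3, "osd_recovery_sleep": 0.1},
    "fast":    {"osd_max_backfills": 4, "osd_recovery_max_active": 6, "osd_recovery_sleep": 0.0},
}


def apply_preset(name):
    for option, value in PRESETS[name].items():
        subprocess.check_call(["ceph", "config", "set", "osd", option, str(value)])
        print(f"set {option} = {value}")


if __name__ == "__main__":
    apply_preset("average")  # start conservatively, as advised above
```

If the UI already manages these settings, changing them directly could be reverted on the next config push, so treat this purely as an illustration of the underlying options.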
exitsys
43 Posts
October 10, 2020, 1:32 pm
Hi, I have now set the backfill speed to "very fast".
As I said, I swapped all SSDs in the third node yesterday.
Node 1
CPU approx. 3%
Memory approx. 19%
Disk utilization about 9% on some disks
Network utilization approx. 7%
Node 2
CPU approx. 12%
Memory approx. 23%
Disk utilization between 7% and 36% on some disks
Network utilization approx. 97%
Node 3
CPU approx. 13%
Memory approx. 20%
Disk utilization about 66% on some disks
Network utilization approx. 92%
In the time it has taken me to write this, Ceph health has gone from 15% degraded to 1.2%, so I would say it is moving pretty damn fast now.
The bottleneck is now the backend network. I have it on active-backup. Maybe I should set it to balance-alb?
Last edited on October 10, 2020, 1:37 pm by exitsys · #13
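Before switching bond modes, it can help to confirm what the kernel is actually running: with active-backup, only one slave link carries traffic at a time, whereas balance-alb or LACP can spread Ceph's many backfill connections across both links. A minimal sketch that reads the bonding driver's status file follows; the interface name `bond0` is an assumption and should be replaced with the backend bond name used on the PetaSAN nodes.

```python
#!/usr/bin/env python3
"""Sketch: report the Linux bonding mode and active slave.

Reads /proc/net/bonding/<iface>, which the bonding driver exposes on any
node that has a bond configured. "bond0" is an assumed interface name.
"""
from pathlib import Path


def bond_summary(iface="bond0"):
    text = Path(f"/proc/net/bonding/{iface}").read_text()
    for line in text.splitlines():
        # These lines show the mode, which slave is active, and link health.
        if line.startswith(("Bonding Mode:", "Currently Active Slave:",
                            "Slave Interface:", "MII Status:")):
            print(line.strip())


if __name__ == "__main__":
    bond_summary()
```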
admin
2,930 Posts
October 10, 2020, 3:04 pm
Yes, it makes a big difference. I would advise that you do not set the speed above "average" except temporarily, in cases like this, and only while you monitor the load. I would recommend LACP if your switches support MLAG. For the iSCSI issue, I recommend you look at the second point posted earlier. Good luck.
Last edited on October 10, 2020, 3:08 pm by admin · #14