Migrate from Proxmox CEPH to PetaSAN!
Gilberto
5 Posts
December 6, 2018, 5:35 pm
Hi there!
We are about to migrate our 6-server Proxmox Ceph cluster to PetaSAN!
Each server is an IBM System x3100 M4 with 16 GB of RAM.
The HDDs are a mix of 5400 RPM and 7200 RPM SATA drives.
Additionally, there is one SSD acting as the journal device.
We use two facilities, with three servers in each.
Between these facilities we have an optical link that provides only 1 Gbps! =( I know, this is awful!
We experience a lot of slowdown when one of the servers crashes and Ceph needs to resync.
I already tried lowering the weight of the slow HDDs, but it had no effect!
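(For reference, one common way to lower the weight of a slow OSD is to reduce its CRUSH weight; a minimal sketch, where osd.7 and the value 0.5 are placeholder values, not taken from this cluster:)
ceph osd crush reweight osd.7 0.5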
Now the question is: can we achieve more speed with PetaSAN?
Thanks
admin
2,930 Posts
December 6, 2018, 8:05 pm
Hi,
Since both use Ceph, I doubt there will be significant differences. To get decent performance you need good hardware; the low speed during recovery is an indication that the cluster is under-powered. There are some config parameters that limit the recovery load, which may help in your case, but that is not a real solution.
Good luck.
Gilberto
5 Posts
December 6, 2018, 8:21 pm
Quote from admin on December 6, 2018, 8:05 pm
Hi,
Since both use Ceph, I doubt there will be significant differences. To get decent performance you need good hardware; the low speed during recovery is an indication that the cluster is under-powered. There are some config parameters that limit the recovery load, which may help in your case, but that is not a real solution.
Good luck.
What is your suggestion? Which parameters could be changed to reduce the recovery workload?
Maybe osd_recovery_max_active?
Just to show, here is the output of ceph -s:
ceph -s
cluster:
id: e67534b4-0a66-48db-ad6f-aa0868e962d8
health: HEALTH_WARN
348369/2467839 objects misplaced (14.116%)
Degraded data redundancy: 396/2467839 objects degraded (0.016%), 85 pgs degraded
services:
mon: 5 daemons, quorum pve-ceph01,pve-ceph02,pve-ceph03,pve-ceph04,pve-ceph05
mgr: pve-ceph05(active), standbys: pve-ceph01, pve-ceph02, pve-ceph03, pve-ceph04
osd: 21 osds: 21 up, 21 in; 180 remapped pgs
data:
pools: 1 pools, 512 pgs
objects: 822.61k objects, 3.02TiB
usage: 9.26TiB used, 53.5TiB / 62.8TiB avail
pgs: 396/2467839 objects degraded (0.016%)
348369/2467839 objects misplaced (14.116%)
283 active+clean
142 active+remapped+backfill_wait
48 active+recovery_wait+degraded
37 active+recovery_wait+degraded+remapped
1 active+recovery_wait
1 active+remapped+backfilling
io:
client: 10.6KiB/s rd, 5.61KiB/s wr, 0op/s rd, 1op/s wr
recovery: 330KiB/s, 0objects/s
I would appreciate it if you could help!
Thanks
Last edited on December 6, 2018, 8:33 pm by Gilberto · #3
Gilberto
5 Posts
December 6, 2018, 9:12 pm
I tried:
ceph tell osd.* injectargs '--osd-max-backfills 1'
ceph tell osd.* injectargs '--osd-max-recovery-threads 1'
ceph tell osd.* injectargs '--osd-recovery-op-priority 1'
ceph tell osd.* injectargs '--osd-client-op-priority 63'
ceph tell osd.* injectargs '--osd-recovery-max-active 1'
ceph osd set nodeep-scrub
But it seems to have no effect. Any suggestions?
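(One way to confirm whether the injected values actually applied is to query a running OSD over its admin socket; a sketch only, where osd.0 is an example ID and the command must run on the node hosting that OSD:)
ceph daemon osd.0 config show | grep -E 'osd_max_backfills|osd_recovery_max_active'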
admin
2,930 Posts
December 6, 2018, 9:14 pm
Try these settings in the conf file:
osd_max_backfills = 1
osd_recovery_sleep = 1
osd_recovery_max_active = 1
osd_recovery_priority = 1
osd_recovery_op_priority = 1
osd_client_op_priority = 63
osd_scrub_during_recovery = false
Also, use the CFQ I/O scheduler for the HDDs.
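(For reference, these lines would normally go under the [osd] section of /etc/ceph/ceph.conf and take effect after the OSDs restart, or they can be injected at runtime as shown earlier. A sketch of switching one HDD to CFQ, with sdb as a placeholder device name; this does not persist across reboots on its own:)
cat /sys/block/sdb/queue/scheduler         # the current scheduler is shown in brackets
echo cfq > /sys/block/sdb/queue/scheduler  # select CFQ for this disk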
Gilberto
5 Posts
December 6, 2018, 9:36 pm
Yep... I am trying these too:
ceph tell osd.* injectargs '--osd-max-scrubs 1'
ceph tell osd.* injectargs '--osd-scrub-max-interval 4838400'
ceph tell osd.* injectargs '--osd-scrub-min-interval 2419200'
ceph tell osd.* injectargs '--osd-deep-scrub-interval 2419200'
ceph tell osd.* injectargs '--osd-scrub-interval-randomize-ratio 1.0'
ceph tell osd.* injectargs '--osd-disk-thread-ioprio-class idle'
ceph tell osd.* injectargs '--osd-disk-thread-ioprio-priority 0'
ceph tell osd.* injectargs '--osd-scrub-chunk-max 1'
ceph tell osd.* injectargs '--osd-scrub-chunk-min 1'
ceph tell osd.* injectargs '--osd-deep-scrub-stride 1048576'
ceph tell osd.* injectargs '--osd-scrub-load-threshold 5.0'
ceph tell osd.* injectargs '--osd-scrub-sleep 0.1'
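(If these scrub limits help, they would typically also be added to the [osd] section of ceph.conf so they survive OSD restarts; a minimal sketch mirroring a few of the values above:)
[osd]
osd_max_scrubs = 1
osd_scrub_sleep = 0.1
osd_scrub_chunk_min = 1
osd_scrub_chunk_max = 1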