Slowing re-balance after node addition.
admin
2,930 Posts
February 11, 2019, 3:13 pm
We now recommend slowing down the re-balance speed when adding a new node to an existing cluster that already stores a lot of data. Adding a node re-distributes / re-balances existing data from the current nodes. Depending on the available hardware and the amount of data, this can put enough stress on the system to slow client I/O and, in some cases, almost freeze it.
This is a general Ceph issue. PetaSAN already configures the re-balance/backfill speed lower than the Ceph defaults; however, in the above cases it is better to slow it down even more:
# After adding a node
ceph tell osd.* injectargs '--osd_max_backfills 1'
ceph tell osd.* injectargs '--osd_recovery_max_active 1'
ceph tell osd.* injectargs '--osd_recovery_sleep 2'
# After rebalance completes (monitor via PG Status on Dashboard)
ceph tell osd.* injectargs '--osd_recovery_sleep 0'
This will be configurable via the UI in v. 2.3.
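Until then, the commands above can be wrapped in a small script so the throttle and the later reset are applied consistently. This is only a sketch: the function names and the `CEPH_BIN` variable are made up for illustration and are not part of PetaSAN or Ceph. The Ceph commands themselves are the ones listed above. Quoting `'osd.*'` simply keeps the shell from expanding the pattern against files in the current directory.

```shell
#!/bin/sh
# Sketch only: CEPH_BIN, inject_all_osds, slow_rebalance and
# restore_recovery_sleep are hypothetical names chosen for this example.
# CEPH_BIN lets you substitute another command (e.g. echo) for a dry run.
CEPH_BIN="${CEPH_BIN:-ceph}"

inject_all_osds() {
    # Apply one runtime setting to every OSD.
    # 'osd.*' is quoted so the shell passes it to ceph unchanged.
    "$CEPH_BIN" tell 'osd.*' injectargs "$1"
}

slow_rebalance() {
    # Run after adding a node.
    inject_all_osds '--osd_max_backfills 1'
    inject_all_osds '--osd_recovery_max_active 1'
    inject_all_osds '--osd_recovery_sleep 2'
}

restore_recovery_sleep() {
    # Run after the rebalance completes (monitor via PG Status on Dashboard).
    inject_all_osds '--osd_recovery_sleep 0'
}
```

Source the file, then call `slow_rebalance` after adding the node and `restore_recovery_sleep` once the PG Status shows the rebalance is done. Setting `CEPH_BIN=echo` before calling either function prints the arguments instead of running them.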
Last edited on February 11, 2019, 3:25 pm by admin · #1