Slowing re-balance after node addition.

We now recommend that users slow down the re-balance speed when adding a new node to an existing cluster that already stores a lot of data. Adding a node re-distributes / re-balances existing data from the current nodes. Depending on the available hardware and the amount of data, this can stress the system enough to slow client I/O and, in some cases, almost freeze it.

This is a general Ceph issue. PetaSAN already configures the re-balance/backfill speed lower than the Ceph defaults; however, in the above cases it is better to slow it down even more:

# After adding a node
ceph tell osd.* injectargs '--osd_max_backfills 1'
ceph tell osd.* injectargs '--osd_recovery_max_active 1'
ceph tell osd.* injectargs '--osd_recovery_sleep 2'

# After rebalance completes (monitor via PG Status on Dashboard)
ceph tell osd.* injectargs '--osd_recovery_sleep 0'
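
For convenience, the commands above could be wrapped in two small shell functions. This is only a sketch: the names throttle_rebalance and restore_rebalance and the CEPH variable are made up for illustration and are not part of PetaSAN or Ceph. The ceph tell ... injectargs calls are the ones listed above, with osd.* quoted so the shell passes it to ceph unchanged.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with PetaSAN or Ceph).
# CEPH can be overridden, e.g. CEPH=echo, to print the commands
# instead of running them against the cluster.
CEPH="${CEPH:-ceph}"

# Slow backfill/recovery on all OSDs; run after adding a node.
throttle_rebalance() {
    "$CEPH" tell 'osd.*' injectargs '--osd_max_backfills 1' &&
    "$CEPH" tell 'osd.*' injectargs '--osd_recovery_max_active 1' &&
    "$CEPH" tell 'osd.*' injectargs '--osd_recovery_sleep 2'
}

# Remove the recovery sleep again; run once the rebalance has completed.
restore_rebalance() {
    "$CEPH" tell 'osd.*' injectargs '--osd_recovery_sleep 0'
}
```

Source the file, call throttle_rebalance after adding the node, and call restore_rebalance once PG Status on the Dashboard shows the rebalance has finished. Setting CEPH=echo first prints the commands without touching the cluster. Note that injectargs changes the running OSDs only, so a restarted OSD goes back to its configured values.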

This will be added to the UI in v2.3.