OSD Management

Hello,

After using PetaSAN for 6-9 months, we have had to remove OSDs from a pool for several reasons and had to learn to do this through the CLI. Are there any plans to add controls to the GUI to mark an OSD down/out and remove it from the cluster? I feel this would be really helpful, especially now that write cache support has been added. Some people may want to enable it, or remove smaller drives to replace them with larger ones.

Brian

You can remove a stopped OSD from the UI: a remove/delete button shows up when the OSD is stopped.

However, we deliberately do not provide a stop-OSD button. If you need to stop an OSD, you have to do it yourself via:

systemctl stop ceph-osd@X

That is all. We left it out on purpose, so that you know what you are doing before you stop an OSD.
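For anyone stopping an OSD by hand, here is a minimal sketch of a guarded stop. It assumes a Ceph release that has the `ceph osd ok-to-stop` check (Luminous or later) and that it runs as root on the node hosting the OSD. The function name `stop_osd_if_safe` is made up for illustration and is not part of PetaSAN or Ceph.

```shell
# Hypothetical helper: stop ceph-osd@<id> only if Ceph reports that
# stopping it would not make any placement group unavailable.
stop_osd_if_safe() {
    id="$1"
    # Accept only a plain numeric OSD id such as 7.
    case "$id" in
        ''|*[!0-9]*) echo "usage: stop_osd_if_safe <numeric-osd-id>" >&2; return 2 ;;
    esac
    if ceph osd ok-to-stop "osd.${id}"; then
        systemctl stop "ceph-osd@${id}"
    else
        echo "osd.${id} is not safe to stop right now" >&2
        return 1
    fi
}

# Example (on the OSD's node): stop_osd_if_safe 7
```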

 

How about a button that sets a specific OSD's weight to zero, and then, once the OSD has been drained, a button to stop the OSD from the GUI could be made available?

This is important because when you upgrade an OSD, the correct way (according to the Ceph docs) is to reweight the OSD to zero, wait until it reports as empty, stop the OSD, and then delete it from CRUSH (which you already provide a button for once an OSD is missing). Once the new OSD candidate is accepted into the cluster, it is labeled with the first available OSD number and the cluster rebalances itself.
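Done by hand, the drain-then-stop sequence described here could look roughly like the sketch below. It assumes a Ceph release that provides `ceph osd safe-to-destroy` (Luminous or later) and root access on the node hosting the OSD. The function name `drain_and_stop_osd` and the `POLL_SECONDS` variable are illustrative only.

```shell
# Hypothetical helper: drain one OSD, then mark it out and stop it.
drain_and_stop_osd() {
    id="$1"
    # Accept only a plain numeric OSD id such as 7.
    case "$id" in
        ''|*[!0-9]*) echo "usage: drain_and_stop_osd <numeric-osd-id>" >&2; return 2 ;;
    esac

    # 1. Set the CRUSH weight to zero so data migrates off this OSD.
    ceph osd crush reweight "osd.${id}" 0 || return 1

    # 2. Poll until Ceph reports the OSD holds no data that is still needed.
    until ceph osd safe-to-destroy "osd.${id}" >/dev/null 2>&1; do
        sleep "${POLL_SECONDS:-60}"
    done

    # 3. Mark the OSD out and stop its daemon.
    ceph osd out "${id}" || return 1
    systemctl stop "ceph-osd@${id}"
}

# Example (on the OSD's node): drain_and_stop_osd 7
```

Once the daemon is stopped, the OSD can be deleted with the remove/delete button that appears in the UI for stopped OSDs, or from the CLI with `ceph osd purge`.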

Another feature along these lines would be to have new OSDs automatically set to a weight of zero, with another button to set the correct weights once all OSDs are replaced as planned. The weights are typically the OSD's usable capacity in TB: e.g. a 1TB drive that shows 980GB usable is weighted 0.980, and a 2TB drive that shows 1960GB usable is weighted 1.960. (Yes, I know only one OSD should be changed at a time, but you can change several at once, up to the remaining free space in the cluster minus expected writes (so keep a couple of OSDs' worth of space spare), by properly draining the OSDs that are to be changed before removing them from the cluster.)
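Until something like this exists in the GUI, the same idea can be approximated from the CLI. The sketch below follows the weighting convention from this post (usable GB divided by 1000); the function names are made up for illustration. It also assumes the Ceph option `osd_crush_initial_weight` and the `ceph config set` command (Mimic or later; on older releases the option would go in ceph.conf instead). Whether PetaSAN overrides the initial weight when it adds an OSD is not something this sketch accounts for.

```shell
# Hypothetical helpers following the weighting convention in this post.

# Convert usable capacity in GB to a CRUSH weight with three decimals.
weight_from_gb() {
    LC_ALL=C awk -v gb="$1" 'BEGIN { printf "%.3f\n", gb / 1000 }'
}

# Make newly created OSDs join the CRUSH map with weight 0,
# so no data moves onto them until they are reweighted.
new_osds_start_at_zero_weight() {
    ceph config set osd osd_crush_initial_weight 0
}

# Give osd.<id> its real weight once all planned replacements are in place.
set_osd_weight_from_gb() {
    ceph osd crush reweight "osd.$1" "$(weight_from_gb "$2")"
}

# Example:
#   weight_from_gb 980             # prints 0.980
#   weight_from_gb 1960            # prints 1.960
#   set_osd_weight_from_gb 3 980   # runs: ceph osd crush reweight osd.3 0.980
```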

This is a very good way to prevent data loss (which is what we are trying to do while also providing massive storage to our systems). It lets us use the system properly, as the designers of the underlying software intended, to swap out OSDs while keeping our replication counts high and our rebuild times low.