Proper Shutdown of Cluster
merryweathertech
6 Posts
October 1, 2018, 9:00 pm
Hello - really loving PetaSAN! Is there a proper way to shut down the cluster? Also, are there any plans to add a shutdown process to the GUI? Thanks! - Gordon
admin
2,930 Posts
October 2, 2018, 1:47 pm
Generally, Ceph is designed to be always on rather than switched on and off regularly.
If you do need to shut down, I would recommend making sure all client I/O has stopped, then shutting down all nodes. I would recommend shutting the nodes down together within a relatively short window (within 5 minutes or sooner) to avoid triggering any state changes and recovery. Similarly, boot all nodes within the same time span.
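A minimal sketch of this procedure from an admin shell, assuming a cluster named demo and node hostnames node1 through node3 (all placeholders) with SSH access from a workstation:
# Sketch only - cluster name "demo" and hostnames node1..node3 are placeholders.
# 1. Confirm client I/O has stopped and the cluster is healthy.
ceph status --cluster demo
# 2. Shut down all nodes within a short window (a few minutes at most).
for node in node1 node2 node3; do
    ssh root@$node "shutdown -h now" &
done
wait
# 3. Later, power all nodes back on within the same time span and re-check health.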
Last edited on October 2, 2018, 1:48 pm by admin · #2
merryweathertech
6 Posts
October 4, 2018, 3:30 am
Hi - I performed a full shutdown of my three physical servers (all three are monitors, iSCSI, and storage) and added a 10 GbE adapter to each one. I booted them up simultaneously, and after they came online, everything looked good. However, the 100 TB demo iSCSI disk I created is now missing. Is there a way to see what happened? Can I use SSH to check whether it is still there? (The data I put on it still seems to exist, because the usage graph still shows it.) This scares me because I worry about something like this happening in production. Thanks for your help. - Gordon
Last edited on October 4, 2018, 3:34 am by merryweathertech · #3
merryweathertech
6 Posts
October 4, 2018, 8:09 am
Hi - I tried to reproduce this by creating another 100 TB disk and then performing the same steps as before, and it is still in the list. I repeated this several times, including shutting down ungracefully and even leaving the iSCSI connection open with a locked file. I also brought the hosts up with a lot of lag time, and every time the cluster returned to healthy and the iSCSI disk was still there. So my hunch is that it had something to do with adding the 10 GbE adapters after that first iSCSI disk was created, as mentioned above. What do you think? Thanks - Gordon
admin
2,930 Posts
October 4, 2018, 8:34 am
Have you added or deleted any pools? Deleting a pool will delete the disk. Apart from that, and from directly deleting the disk itself, I cannot think of any reason. Ceph will not delete a disk behind your back.
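A quick way to check this from an SSH session, as a sketch (the cluster name demo and pool name rbd below are placeholders, not values from this thread):
# List all pools in the cluster (cluster name "demo" is a placeholder).
ceph osd lspools --cluster demo
# List RBD images in a given pool ("rbd" is just an example pool name).
rbd ls -p rbd --cluster demo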
Last edited on October 4, 2018, 8:35 am by admin · #5
merryweathertech
6 Posts
October 4, 2018, 9:40 am
Hi - no, I never deleted anything. After your response to my first question, I went to the iSCSI list and detached the disk, shut down each node within about 5 seconds of the others, then added the 10 GbE adapters to each node and powered them all on at once. Health was green within a minute, but when I tried to connect to the disk from my test server, it hung. I opened the disk list and noticed the disk was gone. I think I am going to do a full reinstall, starting again without the 10 GbE adapters, and repeat this whole process from start to finish. I will let you know if I reproduce it. Thanks - Gordon
admin
2,930 Posts
October 4, 2018, 10:02 am
The command to show the available images in Ceph is:
rbd ls --cluster xxx
where xxx is the name you gave your cluster.
Something that may be related: if the cluster is not responding, which can happen temporarily when you reboot the entire cluster, the above command may hang until everything is responsive. The UI iSCSI list will also be empty (it uses the same command), but once Ceph is responsive the image should be displayed both by the command and within the UI.
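A short sketch of how one might verify over SSH that the image still exists once the cluster is responsive (the cluster name demo and image name image-00001 are placeholders, not values from this thread):
# Confirm the monitors are reachable and the cluster reports a status.
ceph status --cluster demo
# List RBD images; if this hangs, the cluster is still coming up.
rbd ls --cluster demo
# Inspect a specific image (the image name here is just an example).
rbd info image-00001 --cluster demo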