Adding available space on cluster with differing drive sizes
southcoast
50 Posts
November 8, 2018, 12:19 am
The 3 Dell servers in my cluster each have two drives: two of the Dells (nodes 1 and 3) each have a pair of 136 GB drives, and the 3rd server (node 2) has a 136 GB drive and a 1 TB drive.
How would I make the single 1 TB drive available as a new location to store data?
admin
2,930 Posts
November 8, 2018, 8:48 am
Adding a disk as an OSD makes it available for storage.
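As a quick check after adding the disk through the UI, a minimal sketch using the stock Ceph CLI (this assumes SSH access to a management node; these are plain Ceph commands, not PetaSAN-specific):
# List OSDs per host with their CRUSH weights and up/in status;
# the newly added 1 TB disk should appear as a new osd.N under node 2.
ceph osd tree
# Confirm the cluster is healthy and the raw capacity has grown.
ceph -s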
southcoast
50 Posts
November 8, 2018, 1:39 pm
If I have a disk on one node larger than the others, will PetaSAN bias the storage toward that larger volume? Or is the aggregate storage limited by the size of the smallest data volume in the cluster?
I was creating iSCSI disk volumes for my VMware hosts at 100 MB per logical volume, and at about the 4th volume the cluster became inoperable. After rebooting the nodes I was able to restore access, but even then access kept "flapping". Once I stopped and then deleted two or three of the iSCSI disks, stability was restored. I expect that if I replace my 136 GB data volumes with at least 500 GB disks, matters would improve.
Please advise.
Thank you
admin
2,930 Posts
November 8, 2018, 4:02 pm
The distribution of storage across disks and nodes is indeed weighted according to storage capacity. However, the weight is multiplied by a probabilistic distribution/hash function, which means the distribution is close to the ideal weighting but not exact. The more disks and the more symmetry you have, the closer it comes to the exact weights. In your case, with only 1 OSD per node and large differences in weights among them, it is possible the distribution function will cause enough variance to fill a disk (an OSD will stop if it is 95% full).
So in your case, first try to make the disks as close in size as possible, then try to have more than 1 OSD per node.
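To see the weighting versus actual utilization for yourself, a sketch using the stock Ceph CLI (assuming SSH access to a node; these are plain Ceph commands, not PetaSAN-specific):
# Per-OSD view: WEIGHT is the CRUSH weight (by default roughly the disk size in TiB),
# while %USE is the actual utilization, which can drift well away from the ideal
# weighted share on a small cluster with uneven disk sizes.
ceph osd df tree
# Cluster-wide raw and per-pool usage for comparison.
ceph df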
southcoast
50 Posts
November 8, 2018, 4:45 pm
Thanks, I suspected the wide disparity in disk sizes would cause some disruption to operation. I will see about obtaining data disks a bit larger and closer in size to the one 1 TB spindle.
Can you please provide the proper procedure for disk replacement? I expect I would do this one at a time on each node for the 136 GB disk being replaced.
Thank you
admin
2,930 Posts
November 8, 2018, 5:01 pm
Adding new OSDs is trivial: the node will list all new disks and you just click the add button.
For removing OSDs, we do not allow removing a running OSD from the UI; you need to stop it manually first:
systemctl stop ceph-osd@OSD_ID
Then wait for the cluster health to be OK (active/clean), then delete the stopped OSD from the UI.
Do this one OSD at a time.
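Putting those steps together, a minimal sketch of the replacement flow from the shell (the OSD ID 2 below is only an example; substitute the ID of the disk being replaced):
# Stop the OSD that backs the old 136 GB disk.
systemctl stop ceph-osd@2
# Wait until health returns to HEALTH_OK and all PGs are active+clean,
# then delete the stopped OSD from the PetaSAN UI and add the new disk there.
ceph health
ceph -s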
Last edited on November 8, 2018, 5:02 pm by admin
southcoast
50 Posts
November 14, 2018, 1:59 am
I have set my data disks to roughly comparable sizes of 500 GB, 1 TB and 500 GB on nodes 1, 2 and 3 respectively.
I deleted the old OSD and started the new OSD well enough.
As I resized my iSCSI disks to take advantage of the larger data spindles, the UI went offline and no longer responded. SSH access does not work either, although from the network switch the management IP on each server does answer a ping.
What are the performance or server-load considerations when I am changing the size of each iSCSI volume? In my case I was adjusting volumes from 20 GB to 50 GB. The 1st adjustment went okay, but when I made the same adjustment on the next iSCSI volume, that is when my UI access stopped.
admin
2,930 Posts
November 14, 2018, 11:36 am
Most likely it is not the iSCSI disk resize that caused the freeze, but the Ceph layer: deleting and adding OSDs. The recovery and rebalance operations could stress your system. You could look at the charts and check your CPU, RAM and disk % util history; maybe they show a peak.
The hardware we recommend is in our hardware guide.
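If the nodes become reachable again, a sketch of commands for spotting recovery/rebalance pressure (standard Ceph and Linux tools; the sysstat package providing iostat is an assumption and may not be installed):
# Recovery/backfill progress and any slow or blocked requests.
ceph -s
# Per-OSD fill levels; a nearly full OSD is a common cause of stalls.
ceph osd df
# Live per-disk utilization on the node itself, refreshed every 5 seconds.
iostat -x 5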
southcoast
50 Posts
November 14, 2018, 1:15 pm
I suspect my Dell servers, with 16 GB of RAM and two disks each, are at the minimum recommendations. I left last night with all three nodes in the cluster up, but this morning I found one of the three nodes (node 2) down. It is not even accessible via local SSH. I will need to investigate when I am onsite this afternoon.