Increase Number of Replicas limit
therm
121 Posts
August 1, 2017, 6:43 am
Please increase the limit of "Number of Replicas" to 4.
Background:
We are planning to expand our PetaSAN cluster to at least 6 nodes, 3 in each server room plus one additional monitoring node in a third room. In order to prevent an outage of the storage service in case of power loss in one room, the Ceph settings will be:
osd_pool_default_size = 4
osd_pool_default_min_size = 2
and an adjusted crush rule.
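(In ceph.conf this would just go into the [global] section, roughly like the sketch below; as far as I know these defaults only apply to pools created afterwards, so existing pools would have to be changed per pool.)

[global]
    # defaults for newly created pools (sketch)
    osd_pool_default_size = 4
    osd_pool_default_min_size = 2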
Just out of curiosity, what happens if I change the settings by hand while leaving the cluster setting "Number of Replicas" at "3"?
Regards,
Dennis
therm
121 Posts
August 1, 2017, 7:05 am
I found the place in the source:
vi /opt/petasan/services/web/templates/admin/configuration/cluster_settings.html
<!--Replicas-->
<div class="row">
    <div class="col-md-4">
        <div class="form-group">
            <label id="lblReplicas"><i class=""></i> Number of Replicas</label>
            <select class="form-control" name="replica_no" id="replica_no">
                <option value="2" {% if form.replica_no=="2" %} selected="selected" {% endif %}>
                    2
                </option>
                <option value="3" {% if form.replica_no=="3" %} selected="selected" {% endif %}>
                    3
                </option>
                <option value="4" {% if form.replica_no=="4" %} selected="selected" {% endif %}>
                    4
                </option>
            </select>
        </div>
    </div>
</div>
Could you please commit it?
Regards, Dennis
admin
2,921 Posts
August 1, 2017, 6:50 pm
Hi,
You can definitely change the replica count by hand using commands; PetaSAN does not store this (or any similar) value outside of Ceph.
I will take your changes and request that they be included in the next release.
Note that you do not have to create a fourth replica to achieve what you want: it is possible to use 3 or even 2 replicas and define a custom crush map that controls how these replicas are placed, for example placing 1 replica in each room.
We do have plans further down the road to support crush map editing, i.e. a visual editor to define your racks/rooms so that Ceph will place the replicas in a more intelligent/safer way.
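For example, changing an existing pool by hand would be something along these lines (the pool name "rbd" below is just an assumption; list your pools and substitute the real name):

# list pools, then adjust size and min_size per pool
ceph osd lspools
ceph osd pool set rbd size 4        # "rbd" is a placeholder pool name
ceph osd pool set rbd min_size 2
# verify
ceph osd pool get rbd size
ceph osd pool get rbd min_size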
Last edited on August 1, 2017, 6:51 pm · #3
therm
121 Posts
August 2, 2017, 5:25 am
Hi,
thanks for your response. I have read in many places that it is dangerous to use min_size=1 (because there might be no other copy to compare against).
Let's assume the second room loses power: if I split the 3 copies, one copy would be in one room and the other two in a different room. If the second room is the one holding the two copies, only one copy is left. With min_size=2 the system will be read-only; in other words, filesystems will freeze. With min_size=1 filesystems will keep working, but any additional problem (bit flip, failed disk, ...) would lead to total data loss, or at least to data corruption.
My solution would be to invest more money and split 4 replicas into groups of 2, so every room holds two copies. With min_size=2 we would only run into a problem when one room is offline and an additional failure happens on top of that, but even then we would not lose data.
Am I right or wrong?
Regards,
Dennis
admin
2,921 Posts
August 2, 2017, 7:14 am
Yes, your solution looks good.
You will still need to customize the crush map so you do not have some PGs with 3 replicas stored in 3 hosts of the same room. Add a "room" bucket under the default root bucket and above the host bucket, and add a rule which looks like:
min_size 2
max_size 4
step take default
step choose firstn 2 type room
step chooseleaf firstn 2 type host
step emit
or, if you intend to use more than 3 rooms:
min_size 2
max_size 4
step take default
step choose firstn 0 type room
step chooseleaf firstn 2 type host
step emit
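For reference, wrapped in a full rule declaration in the decompiled crush map and injected back, this could look roughly like the sketch below (the rule name, ruleset id and pool name are placeholders; depending on your Ceph version the pool option is crush_ruleset or crush_rule):

rule replicated_two_rooms {
        ruleset 1
        type replicated
        min_size 2
        max_size 4
        step take default
        step choose firstn 2 type room
        step chooseleaf firstn 2 type host
        step emit
}

# edit and re-inject the crush map
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# ... add the room buckets and the rule above to crushmap.txt ...
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new
ceph osd pool set rbd crush_ruleset 1    # "rbd" and id 1 are placeholders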
Good luck
Last edited on August 2, 2017, 7:17 am · #5