Understanding/Using Pools
protocol6v
85 Posts
January 15, 2019, 2:55 pm
Just starting to play with 2.2, trying to understand the best use of pools.
I have an existing hard-drive-based cluster of 4 hosts, each with 24 spinners, and 4 additional hosts I'm looking to spin up with SSDs and/or mixed 15k drives. Would I be best suited to add these new hosts to the existing cluster and use pools or EC rules to place iSCSI disks on the SSDs or faster spinners? Or should I create a whole new deployment?
Thanks!
admin
2,930 Posts
January 15, 2019, 3:19 pm
Maybe add a new cluster. The thing is, even if you define new crush rules for your new disks, the current crush rule of the existing rbd pool, "replicated_rule", will happily use the new disks as well, so they will be serving both pools. You can change the rule for the existing pool, but this will cause most of the existing data to be moved; it is really up to you.
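For reference, switching an existing pool to another rule is a single command, roughly like the following (the pool and rule names here are placeholders):
# re-assign the pool's crush rule; Ceph will start re-placing its data
ceph osd pool set my-pool crush_rule by-host-ssd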
protocol6v
85 Posts
January 15, 2019, 3:32 pm
To move forward, would I be best off getting the burden of moving the data out of the way now, so I don't end up with a bunch of clusters to manage?
What combination of pools, rules, and buckets would I need to effectively keep one iSCSI disk on 7200 rpm drives, one on 15k drives, and one on SSDs?
Thanks!
admin
2,930 Posts
January 15, 2019, 3:58 pm
The pre-defined template rules allow you to differentiate between SSDs and HDDs; all you have to do is define pools with these rules.
Defining different pools for HDDs of different speeds is a bit more complex. One way is to place the different disk types in different hosts and create racks (or rows/rooms) called rack_15k and rack_7k to place the nodes under; then you have to write your own custom rules to do data placement on those upper racks, as sketched below. There is also another method, defining your own device class apart from hdd and ssd, but I would not recommend it as it will probably mess with PetaSAN.
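A rough sketch of the rack approach (host and rack names are only examples, and the rule id just needs to be a free one):
# create rack buckets under the default root and move the hosts into them
ceph osd crush add-bucket rack_15k rack
ceph osd crush add-bucket rack_7k rack
ceph osd crush move rack_15k root=default
ceph osd crush move rack_7k root=default
ceph osd crush move node-15k-01 rack=rack_15k
ceph osd crush move node-7k-01 rack=rack_7k
Then a custom rule that only takes from one of the racks, in crushmap syntax:
rule by-rack-15k {
id 5
type replicated
min_size 1
max_size 10
step take rack_15k
step chooseleaf firstn 0 type host
step emit
}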
alienn
37 Posts
November 6, 2019, 4:23 pm
Quote from admin on January 15, 2019, 3:19 pm
Maybe add a new cluster. The thing is, even if you define new crush rules for your new disks, the current crush rule of the existing rbd pool, "replicated_rule", will happily use the new disks as well, so they will be serving both pools. You can change the rule for the existing pool, but this will cause most of the existing data to be moved; it is really up to you.
Sorry for reviving this old thread...
Right now I have one pool using replicated_rule, and only HDDs are used for it.
When I change the rule from replicated_rule to the by-host-hdd rule (created from the template without modification), will there really be a lot of data movement? De facto nothing would change for the existing data, as it is already on the only available HDDs... 🙂 Or did I miss something?
I plan on adding some SSDs to the nodes and creating a second pool using the by-host-ssd rule for faster storage.
Can I create a rule that only targets NVMe SSDs? What would be the criteria?
Last edited on November 6, 2019, 4:24 pm by alienn · #5
admin
2,930 Posts
November 6, 2019, 8:53 pm
By default Ceph does not give NVMe drives their own device class, it classifies them as ssd; if you want to separate them, you need to tag the devices with a custom device class you create, which you can name nvme.
To re-define a new crush tree while preserving existing device ids, see:
https://docs.ceph.com/docs/master/rados/operations/crush-map-edits/#crush-reclassify
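As a sketch (the OSD id here is just a placeholder), tagging an NVMe OSD and building a rule restricted to that class would look something like:
# clear the auto-assigned class, then set the custom one
ceph osd crush rm-device-class osd.25
ceph osd crush set-device-class nvme osd.25
# replicated rule limited to the nvme class
ceph osd crush rule create-replicated by-host-nvme default host nvme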
alienn
37 Posts
November 20, 2019, 10:42 am
According to the link you posted, I'll be running the following commands on one of the three nodes:
ceph osd getcrushmap -o original # Export existing crushmap
crushtool -i original --reclassify --reclassify-root default hdd -o adjusted # Move all existing osds to device class hdd
crushtool -i original --compare adjusted # compare existing and new crushmap before activating
ceph osd setcrushmap -i adjusted # set new crushmap
I fail at the step where it should modify the crushmap. Here is what I get:
$ crushtool -i original --reclassify --reclassify-root default hdd -o adjusted
classify_root default (-1) as hdd
rule 1 includes take on root default class 0
failed to reclassify map
This is my existing crush tree (ceph osd crush tree --show-shadow):
ID CLASS WEIGHT TYPE NAME
-12 ssd 0 root default~ssd
-10 ssd 0 host ceph01~ssd
-11 ssd 0 host ceph02~ssd
-9 ssd 0 host ceph03~ssd
-2 hdd 218.30383 root default~hdd
-6 hdd 72.76794 host ceph01~hdd
8 hdd 9.09599 osd.8
9 hdd 9.09599 osd.9
10 hdd 9.09599 osd.10
11 hdd 9.09599 osd.11
12 hdd 9.09599 osd.12
13 hdd 9.09599 osd.13
14 hdd 9.09599 osd.14
15 hdd 9.09599 osd.15
-8 hdd 72.76794 host ceph02~hdd
16 hdd 9.09599 osd.16
17 hdd 9.09599 osd.17
18 hdd 9.09599 osd.18
19 hdd 9.09599 osd.19
21 hdd 9.09599 osd.21
22 hdd 9.09599 osd.22
23 hdd 9.09599 osd.23
24 hdd 9.09599 osd.24
-4 hdd 72.76794 host ceph03~hdd
0 hdd 9.09599 osd.0
1 hdd 9.09599 osd.1
2 hdd 9.09599 osd.2
3 hdd 9.09599 osd.3
4 hdd 9.09599 osd.4
5 hdd 9.09599 osd.5
6 hdd 9.09599 osd.6
7 hdd 9.09599 osd.7
-1 218.30383 root default
-5 72.76794 host ceph01
8 hdd 9.09599 osd.8
9 hdd 9.09599 osd.9
10 hdd 9.09599 osd.10
11 hdd 9.09599 osd.11
12 hdd 9.09599 osd.12
13 hdd 9.09599 osd.13
14 hdd 9.09599 osd.14
15 hdd 9.09599 osd.15
-7 72.76794 host ceph02
16 hdd 9.09599 osd.16
17 hdd 9.09599 osd.17
18 hdd 9.09599 osd.18
19 hdd 9.09599 osd.19
21 hdd 9.09599 osd.21
22 hdd 9.09599 osd.22
23 hdd 9.09599 osd.23
24 hdd 9.09599 osd.24
-3 72.76794 host ceph03
0 hdd 9.09599 osd.0
1 hdd 9.09599 osd.1
2 hdd 9.09599 osd.2
3 hdd 9.09599 osd.3
4 hdd 9.09599 osd.4
5 hdd 9.09599 osd.5
6 hdd 9.09599 osd.6
7 hdd 9.09599 osd.7
I just want to move my one and only existing pool (ceph-replicated-3) from the default rule "replicated_rule" to the rule "by-host-hdd", and I hope I can achieve this without too much data movement. As far as my research shows, I could do `ceph osd pool set ceph-replicated-3 crush_rule by-host-hdd`, but that would move a lot of data around unnecessarily, right?
admin
2,930 Posts
November 20, 2019, 9:38 pm
Can you decompile your original crush map and post the text here?
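Something like this should do it (the file names are just examples):
ceph osd getcrushmap -o crushmap.bin # export the binary crushmap
crushtool -d crushmap.bin -o crushmap.txt # decompile it to text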
Last edited on November 20, 2019, 9:39 pm by admin · #8
alienn
37 Posts
November 21, 2019, 4:00 am
Of course.
Here is the content of the decompiled crushmap:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 15 osd.15 class hdd
device 16 osd.16 class hdd
device 17 osd.17 class hdd
device 18 osd.18 class hdd
device 19 osd.19 class hdd
device 21 osd.21 class hdd
device 22 osd.22 class hdd
device 23 osd.23 class hdd
device 24 osd.24 class hdd
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
# buckets
host ceph03 {
id -3 # do not change unnecessarily
id -4 class hdd # do not change unnecessarily
id -9 class ssd # do not change unnecessarily
# weight 72.768
alg straw2
hash 0 # rjenkins1
item osd.0 weight 9.096
item osd.1 weight 9.096
item osd.2 weight 9.096
item osd.3 weight 9.096
item osd.4 weight 9.096
item osd.5 weight 9.096
item osd.6 weight 9.096
item osd.7 weight 9.096
}
host ceph01 {
id -5 # do not change unnecessarily
id -6 class hdd # do not change unnecessarily
id -10 class ssd # do not change unnecessarily
# weight 72.768
alg straw2
hash 0 # rjenkins1
item osd.8 weight 9.096
item osd.9 weight 9.096
item osd.10 weight 9.096
item osd.11 weight 9.096
item osd.12 weight 9.096
item osd.13 weight 9.096
item osd.14 weight 9.096
item osd.15 weight 9.096
}
host ceph02 {
id -7 # do not change unnecessarily
id -8 class hdd # do not change unnecessarily
id -11 class ssd # do not change unnecessarily
# weight 72.768
alg straw2
hash 0 # rjenkins1
item osd.16 weight 9.096
item osd.17 weight 9.096
item osd.18 weight 9.096
item osd.19 weight 9.096
item osd.21 weight 9.096
item osd.22 weight 9.096
item osd.23 weight 9.096
item osd.24 weight 9.096
}
root default {
id -1 # do not change unnecessarily
id -2 class hdd # do not change unnecessarily
id -12 class ssd # do not change unnecessarily
# weight 218.304
alg straw2
hash 0 # rjenkins1
item ceph03 weight 72.768
item ceph01 weight 72.768
item ceph02 weight 72.768
}
# rules
rule replicated_rule {
id 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type host
step emit
}
rule ec-by-host-hdd {
id 1
type erasure
min_size 3
max_size 20
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default class hdd
step chooseleaf indep 0 type host
step emit
}
rule ec-by-host-ssd {
id 2
type erasure
min_size 3
max_size 20
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default class ssd
step chooseleaf indep 0 type host
step emit
}
rule by-host-ssd {
id 3
type replicated
min_size 1
max_size 10
step take default class ssd
step chooseleaf firstn 0 type host
step emit
}
rule by-host-hdd {
id 4
type replicated
min_size 1
max_size 10
step take default class hdd
step chooseleaf firstn 0 type host
step emit
}
# end crush map
admin
2,930 Posts
November 21, 2019, 10:10 pm
Edit the text file and remove the new rules that deal with classes, leaving only the default replicated_rule; I presume you only have one pool using this default rule. You can re-add the new rules later.
# recompile the edited file
crushtool -c crushmap-orig.txt -o crushmap-orig.bin
# convert
crushtool -i crushmap-orig.bin --reclassify --reclassify-root default hdd -o crushmap-new.bin
classify_root default (-1) as hdd
renumbering bucket -1 -> -13
renumbering bucket -7 -> -14
renumbering bucket -5 -> -15
renumbering bucket -3 -> -16
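After that, the compare/apply steps from the docs you quoted should follow, something like:
# check how many mappings would change, then apply the new map
crushtool -i crushmap-orig.bin --compare crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin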
Last edited on November 21, 2019, 10:11 pm by admin · #10