Understanding/Using Pools
protocol6v
85 Posts
January 15, 2019, 2:55 pm
Just starting to play with 2.2, trying to understand the best use of pools.
I have an existing hard-drive-based cluster of 4 hosts, each with 24 spinners, and 4 additional hosts I'm looking to spin up with SSDs and/or mixed 15k drives. Would I be best suited to add these new hosts to the existing cluster and use pools or EC rules to place iSCSI disks on the SSDs or faster spinners? Or should I create a whole new deployment?
Thanks!
admin
2,930 Posts
January 15, 2019, 3:19 pm
Maybe add a new cluster. The thing is, even if you define new crush rules for your new disks, the current crush rule of the existing rbd pool, "replicated_rule", will happily use the new disks as well, so they will be serving both pools. You can change the rule for the existing pool, but this will cause most of the existing data to be moved; it is really up to you.
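For reference, switching an existing pool to another rule is a single command, roughly like the following (the pool and rule names here are placeholders):
# re-assign the pool's crush rule; Ceph will start re-placing its data
ceph osd pool set my-pool crush_rule by-host-ssd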
protocol6v
85 Posts
January 15, 2019, 3:32 pm
To move forward, would I be best off getting the burden of moving the data out of the way now, so I don't end up with a bunch of clusters to manage?
What combination of pools, rules, and buckets would I need to effectively keep one iSCSI disk on 7200 rpm drives, one on 15k drives, and one on SSDs?
Thanks!
admin
2,930 Posts
January 15, 2019, 3:58 pm
The pre-defined template rules allow you to differentiate between SSDs and HDDs; all you have to do is define pools with these rules.
Defining different pools for HDDs of different speeds is a bit more complex. One way is to place the different disk types in different hosts and create racks (or rows/rooms) called rack_15k and rack_7k to place the nodes under; then you have to write your own custom rules to do data placement on those upper racks, as sketched below. There is also another method, defining your own device class apart from hdd and ssd, but I would not recommend it as it will probably mess with PetaSAN.
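A rough sketch of the rack approach (host and rack names are only examples, and the rule id just needs to be a free one):
# create rack buckets under the default root and move the hosts into them
ceph osd crush add-bucket rack_15k rack
ceph osd crush add-bucket rack_7k rack
ceph osd crush move rack_15k root=default
ceph osd crush move rack_7k root=default
ceph osd crush move node-15k-01 rack=rack_15k
ceph osd crush move node-7k-01 rack=rack_7k
Then a custom rule that only takes from one of the racks, in crushmap syntax:
rule by-rack-15k {
id 5
type replicated
min_size 1
max_size 10
step take rack_15k
step chooseleaf firstn 0 type host
step emit
}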
alienn
37 Posts
November 6, 2019, 4:23 pm
Quote from admin on January 15, 2019, 3:19 pm
Maybe add a new cluster. The thing is, even if you define new crush rules for your new disks, the current crush rule of the existing rbd pool, "replicated_rule", will happily use the new disks as well, so they will be serving both pools. You can change the rule for the existing pool, but this will cause most of the existing data to be moved; it is really up to you.
Sorry for reviving this old thread...
Right now I have one pool using replicated_rule, and only HDDs are used for it.
When I change the rule from replicated_rule to the by-host-hdd rule (created from the template without modification), will there really be a lot of data movement? De facto nothing would change for the existing data, as it is already on the only available HDDs... 🙂 Or did I miss something?
I plan on adding some SSDs to the nodes and creating a second pool using the by-host-ssd rule for faster storage.
Can I create a rule that only targets NVMe SSDs? What would be the criteria?
Last edited on November 6, 2019, 4:24 pm by alienn · #5
admin
2,930 Posts
November 6, 2019, 8:53 pm
By default Ceph does not give NVMe drives their own device class, it classifies them as ssd; if you want to separate them, you need to tag the devices with a custom device class you create, which you can name nvme.
To re-define a new crush tree while preserving existing device ids, see:
https://docs.ceph.com/docs/master/rados/operations/crush-map-edits/#crush-reclassify
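As a sketch (the OSD id here is just a placeholder), tagging an NVMe OSD and building a rule restricted to that class would look something like:
# clear the auto-assigned class, then set the custom one
ceph osd crush rm-device-class osd.25
ceph osd crush set-device-class nvme osd.25
# replicated rule limited to the nvme class
ceph osd crush rule create-replicated by-host-nvme default host nvme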
alienn
37 Posts
November 20, 2019, 10:42 am
According to the link you posted, I'll be running the following commands on one of the three nodes:
ceph osd getcrushmap -o original # Export existing crushmap
crushtool -i original --reclassify --reclassify-root default hdd -o adjusted # Move all existing osds to device class hdd
crushtool -i original --compare adjusted # compare existing and new crushmap before activating
ceph osd setcrushmap -i adjusted # set new crushmap
I fail at the step where it should modify the crushmap. Here is what I get:
$ crushtool -i original --reclassify --reclassify-root default hdd -o adjusted
classify_root default (-1) as hdd
rule 1 includes take on root default class 0
failed to reclassify map
This is my existing crush tree (ceph osd crush tree --show-shadow):
ID CLASS WEIGHT TYPE NAME
-12 ssd 0 root default~ssd
-10 ssd 0 host ceph01~ssd
-11 ssd 0 host ceph02~ssd
-9 ssd 0 host ceph03~ssd
-2 hdd 218.30383 root default~hdd
-6 hdd 72.76794 host ceph01~hdd
8 hdd 9.09599 osd.8
9 hdd 9.09599 osd.9
10 hdd 9.09599 osd.10
11 hdd 9.09599 osd.11
12 hdd 9.09599 osd.12
13 hdd 9.09599 osd.13
14 hdd 9.09599 osd.14
15 hdd 9.09599 osd.15
-8 hdd 72.76794 host ceph02~hdd
16 hdd 9.09599 osd.16
17 hdd 9.09599 osd.17
18 hdd 9.09599 osd.18
19 hdd 9.09599 osd.19
21 hdd 9.09599 osd.21
22 hdd 9.09599 osd.22
23 hdd 9.09599 osd.23
24 hdd 9.09599 osd.24
-4 hdd 72.76794 host ceph03~hdd
0 hdd 9.09599 osd.0
1 hdd 9.09599 osd.1
2 hdd 9.09599 osd.2
3 hdd 9.09599 osd.3
4 hdd 9.09599 osd.4
5 hdd 9.09599 osd.5
6 hdd 9.09599 osd.6
7 hdd 9.09599 osd.7
-1 218.30383 root default
-5 72.76794 host ceph01
8 hdd 9.09599 osd.8
9 hdd 9.09599 osd.9
10 hdd 9.09599 osd.10
11 hdd 9.09599 osd.11
12 hdd 9.09599 osd.12
13 hdd 9.09599 osd.13
14 hdd 9.09599 osd.14
15 hdd 9.09599 osd.15
-7 72.76794 host ceph02
16 hdd 9.09599 osd.16
17 hdd 9.09599 osd.17
18 hdd 9.09599 osd.18
19 hdd 9.09599 osd.19
21 hdd 9.09599 osd.21
22 hdd 9.09599 osd.22
23 hdd 9.09599 osd.23
24 hdd 9.09599 osd.24
-3 72.76794 host ceph03
0 hdd 9.09599 osd.0
1 hdd 9.09599 osd.1
2 hdd 9.09599 osd.2
3 hdd 9.09599 osd.3
4 hdd 9.09599 osd.4
5 hdd 9.09599 osd.5
6 hdd 9.09599 osd.6
7 hdd 9.09599 osd.7
I just want to move my one and only existing pool (ceph-replicated-3) from the default rule "replicated_rule" to the rule "by-host-hdd", and I hope I can achieve this without too much data movement. As far as my research shows, I could do `ceph osd pool set ceph-replicated-3 crush_rule by-host-hdd`, but that would move a lot of data around unnecessarily, right?
admin
2,930 Posts
November 20, 2019, 9:38 pm
Can you decompile your original crush map and post the text here?
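Something like this should do it (the file names are just examples):
ceph osd getcrushmap -o crushmap.bin # export the binary crushmap
crushtool -d crushmap.bin -o crushmap.txt # decompile it to text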
Last edited on November 20, 2019, 9:39 pm by admin · #8
alienn
37 Posts
November 21, 2019, 4:00 am
Of course.
Here is the content of the decompiled crushmap:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 15 osd.15 class hdd
device 16 osd.16 class hdd
device 17 osd.17 class hdd
device 18 osd.18 class hdd
device 19 osd.19 class hdd
device 21 osd.21 class hdd
device 22 osd.22 class hdd
device 23 osd.23 class hdd
device 24 osd.24 class hdd
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
# buckets
host ceph03 {
id -3 # do not change unnecessarily
id -4 class hdd # do not change unnecessarily
id -9 class ssd # do not change unnecessarily
# weight 72.768
alg straw2
hash 0 # rjenkins1
item osd.0 weight 9.096
item osd.1 weight 9.096
item osd.2 weight 9.096
item osd.3 weight 9.096
item osd.4 weight 9.096
item osd.5 weight 9.096
item osd.6 weight 9.096
item osd.7 weight 9.096
}
host ceph01 {
id -5 # do not change unnecessarily
id -6 class hdd # do not change unnecessarily
id -10 class ssd # do not change unnecessarily
# weight 72.768
alg straw2
hash 0 # rjenkins1
item osd.8 weight 9.096
item osd.9 weight 9.096
item osd.10 weight 9.096
item osd.11 weight 9.096
item osd.12 weight 9.096
item osd.13 weight 9.096
item osd.14 weight 9.096
item osd.15 weight 9.096
}
host ceph02 {
id -7 # do not change unnecessarily
id -8 class hdd # do not change unnecessarily
id -11 class ssd # do not change unnecessarily
# weight 72.768
alg straw2
hash 0 # rjenkins1
item osd.16 weight 9.096
item osd.17 weight 9.096
item osd.18 weight 9.096
item osd.19 weight 9.096
item osd.21 weight 9.096
item osd.22 weight 9.096
item osd.23 weight 9.096
item osd.24 weight 9.096
}
root default {
id -1 # do not change unnecessarily
id -2 class hdd # do not change unnecessarily
id -12 class ssd # do not change unnecessarily
# weight 218.304
alg straw2
hash 0 # rjenkins1
item ceph03 weight 72.768
item ceph01 weight 72.768
item ceph02 weight 72.768
}
# rules
rule replicated_rule {
id 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type host
step emit
}
rule ec-by-host-hdd {
id 1
type erasure
min_size 3
max_size 20
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default class hdd
step chooseleaf indep 0 type host
step emit
}
rule ec-by-host-ssd {
id 2
type erasure
min_size 3
max_size 20
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default class ssd
step chooseleaf indep 0 type host
step emit
}
rule by-host-ssd {
id 3
type replicated
min_size 1
max_size 10
step take default class ssd
step chooseleaf firstn 0 type host
step emit
}
rule by-host-hdd {
id 4
type replicated
min_size 1
max_size 10
step take default class hdd
step chooseleaf firstn 0 type host
step emit
}
# end crush map
admin
2,930 Posts
November 21, 2019, 10:10 pm
Edit the text file and remove the new rules that deal with classes, leaving only the default replicated_rule; I presume you only have one pool using this default rule. You can re-add the new rules later.
# recompile the edited file
crushtool -c crushmap-orig.txt -o crushmap-orig.bin
# convert
crushtool -i crushmap-orig.bin --reclassify --reclassify-root default hdd -o crushmap-new.bin
classify_root default (-1) as hdd
renumbering bucket -1 -> -13
renumbering bucket -7 -> -14
renumbering bucket -5 -> -15
renumbering bucket -3 -> -16
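After that, the compare/apply steps from the docs you quoted should follow, something like:
# check how many mappings would change, then apply the new map
crushtool -i crushmap-orig.bin --compare crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin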
Last edited on November 21, 2019, 10:11 pm by admin · #10